UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE

TITLE
Dependency Directed Reasoning for Complex Program Understanding

AUTHOR
Howard Elliot Shrobe

PERFORMING ORGANIZATION NAME AND ADDRESS
Artificial Intelligence Laboratory
545 Technology Square
Cambridge, Massachusetts 02139

CONTROLLING OFFICE NAME AND ADDRESS
Advanced Research Projects Agency
1400 Wilson Blvd
Arlington, Virginia 22209

MONITORING AGENCY NAME AND ADDRESS (if different from Controlling Office)
Office of Naval Research
Information Systems
Arlington, Virginia 22217

SECURITY CLASS
UNCLASSIFIED

DISTRIBUTION STATEMENT
Distribution of this document is unlimited.

SUPPLEMENTARY NOTES
None

KEY WORDS
Automated Deduction
Dependency Networks
Truth Maintenance
Program Analysis
Program Understanding
Program Verification
Programmer's Apprentice


Block  20  continued: 


well enough to modify them.

There is also a complexity barrier in the world of commercial software which is making the
cost of software production and maintenance prohibitive. Here too a system which is capable
of understanding complex programs is a necessary step. The Programmer's Apprentice Project
(Rich and Shrobe, 76) is attempting to develop an interactive programming tool which will
help expert programmers deal with the complexity involved in engineering a large software
system.

This report describes REASON, the deductive component of the programmer's apprentice. REASON
is intended to help programmers in the process of evolutionary program design. REASON
utilizes the engineering techniques of modelling, decomposition, and analysis by inspection
to determine how modules interact to achieve the desired overall behavior of a program.

REASON coordinates its various sources of knowledge by using a dependency-directed structure
which records the justification for each deduction it makes. Once a program has been
analyzed, these justifications can be summarized into a teleological structure called a
plan which helps the system understand the impact of a proposed program modification.




This  report  describes  research  done  at  the  Artificial  Intelligence  Laboratory  of 
the  Massachusetts  Institute  of  Technology.  Support  for  the  laboratory's  artificial 
intelligence  research  is  provided  in  part  by  the  Advanced  Research  Projects  Agency 
of the Department of Defense under Office of Naval Research contract N00014-75-C-0643.


Dependency  Directed  Reasoning 


For

Complex  Program  Understanding 


by 

Howard  Elliot  Shrobe 

Massachusetts  Institute  of  Technology 
April  1979 


Revised  version  of  a  dissertation  submitted  on  August  31,  1978  to  the  Department  of 
Electrical  Engineering  and  Computer  Science  in  partial  fulfillment  of  the  requirements 
for  the  degree  of  Doctor  of  Philosophy. 


Abstract


Artificial Intelligence research involves the creation of extremely complex programs
which must possess the capability to introspect, learn, and improve their expertise.
Any truly intelligent program must be able to create procedures and to modify them
as it gathers information from its experience. [Sussman, 1975] produced such a system
for a "mini-world"; but truly intelligent programs must be considerably more complex.
A crucial stepping stone in AI research is the development of a system which can
understand complex programs well enough to modify them.

There is also a complexity barrier in the world of commercial software which is
making the cost of software production and maintenance prohibitive. Here too a
system which is capable of understanding complex programs is a necessary step. The
Programmer's Apprentice Project [Rich and Shrobe, 76] is attempting to develop an
interactive programming tool which will help expert programmers deal with the
complexity involved in engineering a large software system.

This report describes REASON, the deductive component of the programmer's
apprentice. REASON is intended to help expert programmers in the process of
evolutionary program design. REASON utilizes the engineering techniques of
modelling, decomposition, and analysis by inspection to determine how modules
interact to achieve the desired overall behavior of a program. REASON coordinates
its various sources of knowledge by using a dependency-directed structure which
records the justification for each deduction it makes. Once a program has been
analyzed, these justifications can be summarized into a teleological structure called a
plan which helps the system understand the impact of a proposed program
modification.


Acknowledgements 


I would have found this research impossible to conduct had it not been for the
encouragement and cooperation given me by my supervisor Gerald Jay Sussman, my
readers (and mentors) Carl Hewitt and Marvin Minsky, and my close colleagues Johan
de Kleer, Jon Doyle, Mark Miller, Charles Rich, and Richard Waters. But most of all
I would never have finished this work if it hadn't been for Annie, who set up one
house while the other almost came tumbling down.


CONTENTS


1. The Importance of Program Understanding
    1.1 The Problem of Program Maintenance
    1.2 An Imagined Scenario
    1.3 The Research Content of This Thesis
2. An Engineering Theory of Evolutionary Design
    2.1 Type of Programs - Algorithms vs. Systems
    2.2 What Characterizes Evolutionary Change?
    2.3 Why Is Evolutionary Design Necessary?
    2.4 What Do Engineers Do?
    2.5 Plans and Teleology
    2.6 Representing Plans
    2.7 Plans in Maintenance and Explanation
    2.8 Dependency Directed Reasoning
3. The Reasoning System
    3.1 Dependencies and Justifications
4. Explicit Control and The Task Network
    4.1 Hypothetical
    4.2 The Rules Of Inference
    4.3 Closing The Reflexive Loop
    4.4 Equality, Reference and Anonymous Objects
    4.5 Situational Logic
5. Describing Programs
    5.1 Specs - I/O Descriptions
    5.2 Plan Diagrams
6. A Symbolic Interpreter for Plan Diagrams
7. An Example of Symbolic Interpretation
8. The Temporal Viewpoint
    8.1 A Paradigmatic Example
    8.2 Situations and Orderings
    8.3 Temporal Collections
    8.4 Temporal Collections Inputs and Outputs
    8.5 Temporal Collection Data flows
9. The Recognition Paradigm
    9.1 Abstract Flows, Data and Control Pathways
    9.2 Summary
10. Description of Data-Structures
    10.1 The Data-Description Language
    10.2 Parameterized Object Descriptions
    10.3 Implementation and Virtual Objects
    10.4 A Catalogue of Object Descriptions
11. Reasoning About Side-effects
    11.1 Specifying Side-effects
    11.2 Reasoning About Simple Side Effects
    11.3 Safe from and Not Safe-from
    11.4 More Complicated Effects
    11.5 Determining What's Affected
    11.6 An Example
    11.7 Pseudo Parallelism
12. Reducing Complexity in Side Effect Analysis
13. Reasoning About Program Modifications
    13.1 Updating The Recognition Map
14. Conclusions
    14.1 Good Decisions
    14.2 Problems
    14.3 Future Directions
15. A Survey of Related Work
    15.1 Newer Areas of Verification Research
    15.2 Apprentice-Like Systems
    15.3 Dependency Based Reasoning
16. Bibliography


Chapter  1:  The  Importance  of  Program  Understanding 


There  is  a  fundamental  distinction  between  running  a  program  to  do  something 
and  asking  it  to  understand  how  it  accomplishes  that  very  same  task.  Even  huge 
programs, like MACSYMA [Macsyma, 1975], lack the ability to introspect, to examine
their  own  procedures.  But,  lacking  the  ability  to  introspect,  they  also  lack  the  ability 
to  describe  their  own  behavior,  to  modify  their  behavior,  or  to  rationally  plan  the 
allocation  of  internal  resources  among  competing  tasks.  To  do  such  tasks,  a  program 
must  understand  itself. 

To understand itself a program must be able to understand programs. Artificial
Intelligence programs are large and complex; they maintain large knowledge-bases, use
multiple layers of interpreters, and frequently employ advanced control structures.
This thesis attempts to formalize and represent some of the knowledge necessary to
understand and explain such programs. It is incomplete and exploratory; more
questions are raised than answered. However, I believe that several important
advances are made, and hope that this work may serve as a bridge from past
exploratory work of [Minsky, 1968], [McCarthy, 1968] and [Sussman, 1975] to future
work on truly self-conscious systems such as proposed in [Doyle, 1978].

How can a program understand another program? In a step towards this,
Sussman [Sussman, 1975] introduced a paradigm of how an intelligent computer
program can acquire new skills. In this paradigm, a planning program first attempts
to combine old procedures to form a "first order" approximation to a desired
complex goal. This new procedure is then executed in a "careful mode", maintaining
a record of the process. If one of several pre-defined kinds of "bugs" is recognized,
the system attempts to analyze the bug and to debug its procedure so as to achieve
the desired goal.

Why is it not enough for a program just to understand its subject matter? What
makes us want to have it "understand itself" as well? Even among the earliest works
of AI, the issue of self-consciousness was raised. McCarthy's proposal for the "Advice
Taker" [McCarthy, 1968] was in essence a proposal to develop a system which
understands its procedures well enough to be told how to employ them effectively, i.e.
to take advice. Although many of McCarthy's original plans fell short of the mark,
there were many seminal ideas present as well. Similarly Minsky's Matter, Mind and
Models [Minsky, 1968] raised many of these issues. Sussman's HACKER was the first
substantially detailed system to exhibit a serious approach to self-conscious acquisition
of knowledge.

HACKER  could  be  considered  introspective  in  a  limited  sense.  It  knew  what  it 
had  done,  why  it  had  done  it,  and  what  higher  level  goal  each  action  was  intended  to 
achieve.  Finally,  it  could  examine  the  procedure  it  had  coded  and  modify  this 
procedure's  code. 

But  HACKER  was  very  limited  in  its  expertise.  It  knew  only  about  the  blocks 
world, a mini-world with a one-armed robot, a collection of blocks, and simple goals
like  building  a  tower.  Such  simplicity  is  crucial  in  the  initial  stages  of  scientific 
exploration;  and  HACKER  remains  an  important  milestone.  However,  to  progress  we 
must  be  capable  of  engineering  a  system  which  can  understand  procedures  of 
complexity  greater  than  those  of  HACKER.  We  need  a  system  with  greater  expertise 
about  programs! 

Follow-up work, unfortunately, has not developed very far in this direction until
recently. Goldstein [Goldstein, 1974] in his MYCROFT system attempted to present
a more sophisticated taxonomy of procedures and bugs, but he also worked in a
mini-world, the domain of graphics programs written by elementary school children.
Sacerdoti [Sacerdoti, 197?] and Waldinger [Waldinger, 1977] both did further work on
the simultaneous subgoal problem, the most studied bug in HACKER's repertoire.
However, none of these systems would meet the criterion of being experts at
programming.

This thesis will investigate the expertise needed to represent and understand AI
programs. It will also present a reasoning system which knows what it is doing at
each step and which uses this information in deciding what to do next. This seems to
me to be a basic step toward systems which are capable of modifying and changing
their own procedures. [Doyle, 1978b] proposes to investigate such an architecture
thoroughly. My work should be seen as attacking some of the preliminary technical
matters of the investigation; it is not a solution, but merely a stepping stone along the
route to self-conscious systems.

What  is  needed?  Here  is  an  example  of  an  episode  in  which  I  described  -  to 
myself  (and  my  tape  recorder)  -  a  plan  for  developing  a  certain  system: 




"This  program  is  going  to  be  doing  a  network  type  search.  So  the 
main  feature  of  the  implementation  will  be  a  Conniver-like  data-base 
which  in  this  case  doesn't  need  contexts.  There  will  be  a  demon  feature 
where  we  will  allow  arbitrary  number  of  patterns  in  each  demon.  An 
assertion  will  be  a  nested  list  structure;  most  of  them  will  be  simple. 
They'll  describe  the  various  kinds  of  relations. 

The  data-base  will  take  an  assertion  and  regard  it  as  a  treelike 
structure,  moving  through  the  structure  with  the  standard  tree  traversal 
until  it  reaches  each  terminal  node.  Each  terminal  will  be  indexed  by  a 
combination  of  its  unique  identifier  (the  MacLisp  function  Maknum)  and 
its  position  in  the  list  structure.  Indexing  means  calculating  a  bucket  to 
put the assertion into. Position is calculated by bit patterns which I'll
describe later. These two numbers are combined to form one number and
this is used to calculate an index into the array. You do this for each
terminal node and insert the assertion in each bucket you get. The bucket
will be in increasing order, so the insert will be an ordered-splice-in."

What expertise and knowledge are involved in understanding this description?
Certainly one must know something about hash-tables, arrays, list structure, recursion,
etc. But what about these structures is it important to know? One has to reason
about the behavior of such objects. Evidently, I do this by making references to
standard kinds of procedures like ordered-splice-in and tree-traversal. What are these
cliches? They seem to be more than just particular patterns of code; they are talked
about as if they have a more abstract quality. How can we represent these "abstract
procedures"? What are the rules for their combination? Certainly any system which
hopes to understand complex procedures must have answers to some of these questions.

These tasks seem hard and complex and one might well wonder whether this
research will produce a practical payoff somewhere in the not too distant future. I
believe it will. Program understanding is crucial not only to AI, but to Computer
Science and the computer industry as well. The rapidly growing power of
computational hardware has led to new demands for qualitatively more complex
software. Commercial software production is reaching a "complexity barrier"
[Winograd, 1973]. I believe that this barrier can only be escaped by using the
computer as an intelligent and sophisticated support system for the expert programmer.




Section 1.1: The Problem of Program Maintenance
Notes  of  A  Beleaguered  Systems  Programmer 

Software, and "software maintenance" in particular, has become the major expense of
computation. As machines have grown larger, faster, and cheaper, the programs which run
on those machines have grown more ambitious and complex. But the programmer's
tools for maintaining software have not kept pace with this growth.

Why has "maintenance" become so important? The word is probably a misnomer
which covers up the real issue: the evolutionary nature of the programmer-user
relationship. Specifications for large systems are frequently incomplete and unclear;
the user doesn't know exactly what he wants. Given fuzzy criteria the designer does
the best he can, guessing here, making temporary choices there.

Once  a  program  reaches  the  stage  of  initial  implementation  new  desiderata  are 
almost  always  discovered;  "This  report  should  have  these  3  extra  fields;  that  one 
provides  extraneous  information."  New  hardware  becomes  available  resulting  in 
changes  in  the  requirements  and  new  opportunities  for  improvements.  In  addition,  the 
currently  available  features  suggest  new  ones  which  could  be  implemented  if  only 
certain  modifications  were  made. 

So while the first implementation is running, work is started on adding features
and reworking the last implementation. Running experience reveals the existence of
some new bugs which force additional redesign. In this process the programmer again
and again finds himself trying to remember whether it is safe to "smash the record"
before it is stored, whether any module is using the second bit of the dispatch queue
entry, etc. In general he is forced to consider all possible places which might be
affected by any proposed change. Of course, one does what one can and version two
eventually appears.

At this point, the user and the programmer notice that there are new features, a
brand  new  terminal  which  would  allow  real  time  interaction,  and  of  course  the 
inevitable  bugs.  So  while  version  two  is  being  run,  the  design  evolves  again;  version 
three  is  laid  out  on  the  drawing  board.  And  so  on... 




If the production of software is not to be halted by these problems, new help
must be found. Computer science currently has two types of solutions; we shall
propose a third. The first type of solution, disciplined programming, consists of
improvements which avoid an automation of intelligent human skills. These include
languages such as CLU [Liskov, et al., 1977], ALPHARD [Wulf, et al., 1976], etc.
which attempt to reflect the programmer's intent in the code and to modularize the
system so that dependencies are localized. In this kind of programming methodology,
errors are minimized and some modifications of the software become simpler. Other
efforts short of automation of human skills include the editors, debugging tools, and
the like which systems like INTERLISP [Teitelman, 1975, 77] and The Programmer's
Workbench [Dolotta, 1976] have packaged into integrated language support facilities.

At the other extreme is the proposal to automate the programming process itself,
removing the programmer from all but the most high-level design decisions. Automatic
programming [Balzer, 1975], however, assumes a highly intelligent computer program
which is skilled in the problem domain, algorithmic analysis, data structure selection,
etc. Some success has been achieved [Green, 1977], [Barstow, 1977],
[Manna and Waldinger, 1977]. But a realistic appraisal would suggest that automatic
programming systems will not be practically successful without the development of
very advanced AI techniques.

I would guess that no truly proficient automatic programming system can exist
which is not capable of introspection and self-modification, i.e. of skill acquisition,
improvement and development. Once a deep theory of such skills is developed, it
might become possible to build automatic programmers which, given advice from an
expert human programmer, will improve their skills with practice.

I will present here yet a third approach, called the Programmer's Apprentice
[Rich and Shrobe, 1976], [Smith and Hewitt, 1975], which is intended to serve as an
intelligent  assistant  to  the  expert  programmer  during  a  program's  evolution.  The 
apprentice  has  only  limited  skills;  it  is  not  yet  expert  in  areas  of  program  design  or 
efficiency,  but  it  does  contain  a  large  body  of  knowledge  about  programs  and  fairly 
sophisticated  reasoning  capabilities. 



To use the apprentice one should not be required to provide so much detail
that the system would lose all practical utility. Instead I imagine the
programmer providing the apprentice with approximately the same volume and type of
information that is now supplied as background documentation and in-line
commentary. Given this information the apprentice should be able to:

(1) Modularize the code into appropriate segments each of which has logical coherence
and an easily described behavior.

(2) Derive an explanation of how the behaviors of the segments interact to achieve the
desired goals of the whole program.

(3) Deduce which features of the program are crucial and which are gratuitous.

(4) Relate this program to commonly used techniques of programming.

(5) Help the programmer decide whether a proposed method could, in fact, achieve
the desired goals and whether its sub-units can be fitted together in a coherent
manner.

(6) Detect coding errors as failures of the written code to correspond to the design.

(7) Index this information for ease of use in program explanation. The apprentice
should be able to explain a program in high-level human-like terms.

(8) Reason about the effect of proposed modifications to the code without having to
analyze the entire program starting from scratch.

Our apprentice system is set apart from verification systems like those of
[Deutsch, 1973], [King, 1969], [Igarashi, et al., 1975] by its central focus on the
evolutionary character of the program design process. As I will explain later, this has
led me away from a concern for "proving programs correct". Instead, I have been
more interested in building a system which will interact with a programmer during the
period of design evolution and which can converse with the programmer in terms
which an experienced software engineer would find natural and familiar. The
apprentice's goal is to interact with the programmer to develop reasonable designs
which meet the engineer's criterion of "good enough" (as opposed to the mathematician's
criterion of provably correct). The apprentice should be able to analyze program
designs at varying degrees of detail. During an initial interactive session it should be
able to analyze the program sufficiently to catch obvious bugs. The information
obtained by this analysis should be saved so that the apprentice can help assess the
effects of changes which the programmer might want to make in the future. Finally,
when the cost and time are merited the apprentice should be able to conduct a more
thorough analysis and to verify properties of the program.




Section 1.2: An Imagined Scenario
An  Idealised  Example  of  Using  an  Apprentice 

A  typical  interaction  between  the  apprentice  and  a  programmer  building  an 
associative  retrieval  system  would  look  something  like  the  following  (Note:  as  usual  the 
use  of  English  dialogue  is  a  convenient  fiction  whose  only  purpose  is  to  make  the 
presentation  more  comprehensible.  Natural  language  understanding  and  generation  are 
beyond the scope of this work.) A similar scenario was presented as a "wish list" in
[Floyd, 1971]; at the end of the scenario I will indicate how much of my wish list is
met by the current system and its foreseeable development.


Programmer: I want to make an associative retrieval system which will be something
like the one in CONNIVER. It will store each assertion in each bucket hashed to by
one of its leaf nodes. I'm going to start by coding the insert routine for this
data-base. Here's the code:


(defun insert (assertion table size)
  (do ((assert assertion (cdr assert))
       (index 1 (plus 1 (times index 2)))
       (car-assert nil))
      ((null assert))
    (setq car-assert (car assert))
    (cond
      ((atom car-assert)
       (bucket-insert array (hash car-assert index size) assertion))
      (t (insert car-assert table size)))))

Note: The code above is written in MACLISP. The do function used
above is an iteration construct. The first s-expression following the do is a
set of clauses controlling the iteration. Each clause begins with a variable
name which is bound locally by the do. If there is an s-expression
following the variable name then the variable is initialized to have the
value of this expression; otherwise the variable is initialized to nil. If there
is a third s-expression in the clause, then each time the loop is repeated
the variable is assigned the value of this expression. If there are several
loop variables, the assignments of their values are done in parallel.




The second s-expression following the do is a list of s-expressions.

The first of these is a termination test which is evaluated at the beginning
of each loop repetition. If this s-expression evaluates to a non-nil value,
execution of the loop terminates. The following s-expressions in the
termination clause are then evaluated; the value of the last s-expression is the
value of the whole do expression.

All succeeding s-expressions are the body of the loop and are executed
each time the loop is repeated. A return expression in the body of the
loop terminates loop execution. return takes one argument. The value of
this expression is the value of the whole loop.
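
The parallel assignment of loop variables described in this note can be illustrated outside MACLISP. What follows is a minimal sketch in Python rather than MACLISP; the function name trailing_pairs and the two-variable loop it imitates are assumptions chosen for illustration, not code from the thesis. It steps a pair of variables the way a do with the clauses (current items (cdr current)) and (previous nil current) would: both new values are computed from the old values before either variable is assigned.

```python
def trailing_pairs(items):
    """Imitate (do ((current items (cdr current))
                    (previous nil current))
                   ((null current)) ...)
    and record the (previous item, current item) pair seen on each repetition."""
    pairs = []
    current, previous = list(items), None      # initial values of the loop variables
    while current:                             # termination test: stop on the empty list
        pairs.append((previous[0] if previous else None, current[0]))
        # Parallel step: the right-hand side is evaluated first, so
        # `previous` receives the OLD value of `current`.
        current, previous = current[1:], current
    return pairs
```

If the two assignments were made one after the other instead, previous would always equal the new current and the trailing pointer would be lost.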

Apprentice: I'm not sure I understand this code completely. It appears that you're
doing a tree traversal through the assertion and filtering for the terminal nodes which
are the atoms. But I don't understand what the purpose of index is, nor do I
understand why hash takes index as an argument. The hash I know about takes a
single object and an array-range and computes an index of that array.

Programmer: In this system hash takes two arguments: (1) An atom which appears in
the assertion; (2) A number, called index, which indicates the position of the atom in
the  assertion,  given  that  the  assertion  is  viewed  as  a  tree.  If  the  same  atom  appears 
twice  in  the  assertion  then  it  will  have  more  than  one  index;  this  allows  us  to 
distinguish  the  two  occurrences. 

Apprentice: I still don't understand. Explain the numbering scheme to me please.

Programmer: The idea is that every node, except for the root, is the immediate child
of some other node. Number the root node 1 and then each child is numbered as
follows: If the node is the left child of its parent, its number is twice its parent's; if
it is the right child of its parent, its number is twice its parent's plus 1.
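
The numbering scheme just described can be sketched as follows. This is a hedged illustration in Python, not the MACLISP of the scenario: the function name leaf_indexes is hypothetical, nested Python lists stand in for list structure, and, as an assumption about how the scheme applies to atoms, each atom is tagged with the number of the list cell that holds it. The first element of a list plays the role of the left child and the rest of the list the role of the right child.

```python
def leaf_indexes(tree, index=1):
    """Return (atom, index) pairs for a nested-list assertion.

    The root cell is numbered 1; a left child gets twice its parent's
    number and a right child gets twice its parent's number plus 1.
    """
    pairs = []
    while tree:                          # walk down the chain of right children
        head = tree[0]
        if isinstance(head, list):       # left child is a sub-list: recurse into it
            pairs.extend(leaf_indexes(head, index * 2))
        else:                            # left child is an atom held by this cell
            pairs.append((head, index))
        tree, index = tree[1:], index * 2 + 1
    return pairs
```

For the assertion ["a", ["b", "a"], "c"] the two occurrences of "a" receive different numbers, which is what allows the hash to distinguish them.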

Apprentice: The way you wrote the code, when you make the recursive call for the
car (i.e. the left child when the assertion is viewed as a tree) you don't pass the index
of the current node as an argument, so it starts counting from 1 again.

Programmer: You're right. I'll have to change that. That means that the routine will
have to take the index as an argument, which means that there should be an
initialization function to call from the outside. I'll recode it:




(defun insert-1 (assertion table size index)
  (do ((assert assertion (cdr assert))
       (index index (plus 1 (times index 2)))
       (car-assert nil))
      ((null assert))
    (setq car-assert (car assert))
    (cond
      ((atom car-assert)
       (bucket-insert array (hash car-assert index size)
                      assertion))
      (t (insert-1 car-assert table size (times index 2))))))

(defun insert (assertion table size)
  (insert-1 assertion table size 1))

Apprentice: One further question. I assume, given its name, that bucket-insert inserts
its third argument into the bucket indexed by its second argument. You're inserting
assertion on each call, which is not the same object all the time. My understanding of
this type of hash table is that an assertion should be in each bucket that its terminals
hash to. But when you make the recursive call for the left nodes, assertion becomes
the current sub-tree of the assertion.

Programmer: You're right again. I should make the whole assertion an argument to
the procedure as well. Can you make that change?

Apprentice: Yes, consider it done. How about the bucket-insert routine; would you like
to design that now?
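
The scenario does not show the routine after this change. One way it might look is sketched below in Python rather than MACLISP. Everything here is an assumption made for illustration: the names insert_1 and toy_hash, the use of plain Python lists as unordered buckets, and the particular hash formula. The point shown is only that the whole assertion is carried along as a separate argument, so it is the whole assertion, and not the current sub-tree, that is filed in each bucket.

```python
def toy_hash(atom, index, size):
    """Hypothetical stand-in for the two-argument hash: mixes an atom with
    its position number and reduces the result to a bucket number."""
    return (sum(map(ord, str(atom))) * 31 + index) % size


def insert_1(whole, subtree, table, index):
    """Walk `subtree`, filing the WHOLE assertion under each atom found."""
    while subtree:
        head = subtree[0]
        if isinstance(head, list):
            # Left child is itself a list: its number is twice the parent's.
            insert_1(whole, head, table, index * 2)
        else:
            bucket = table[toy_hash(head, index, len(table))]
            bucket.append(whole)          # the whole assertion, not the sub-tree
        # Right child: twice the parent's number plus 1.
        subtree, index = subtree[1:], index * 2 + 1


def insert(assertion, table):
    """Entry point: start the numbering at 1 with the full assertion."""
    insert_1(assertion, assertion, table, 1)
```

A table here is simply a list of buckets, for example [[] for _ in range(8)]; the ordered buckets of the actual design are taken up in the dialogue that follows.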

Programmer: OK. In this design, buckets will be ordered lists of assertions.
bucket-insert will fetch the bucket indexed by its second argument and then do a list-insert.
This will be the standard trailing-pointer-enumeration which tests each item to see if
its unique identifier is larger than that of the assertion. I'll use LISP's maknum to
implement the unique identifier. When the first larger member of the list is found,
I'll splice in the new assertion.

Apprentice: I understand the design, but there is a mistake. If the new assertion is
smaller than any element, then it can't be spliced in.

Programmer: In that case, cons it onto the front of the list.




Apprentice: That won't work either, since you want the assertion to be a member of
the bucket indexed by the hash you've calculated. But if you just cons this new
assertion onto the front of the list it won't be in the list pointed to by the table.
The item of the table in that position will still be the old list, which is the cdr of the
one which contains the new item.

Programmer: OK. If this case is reached we'll not only cons the element on, but we'll
also store this new list in the appropriate position of the table.

Apprentice: The design now is correct.

Programmer: Here's the code:


(defun bucket-insert (array index item)
  (do ((current (arraycall array index) (cdr current))
       (previous nil current)
       (unique-id (maknum item)))
      ((null current)
       (cond ((null previous)
              (array-store array index (list item)))
             (t (rplacd previous (list item)))))
    (cond
      ((> (maknum (car current)) unique-id)
       (cond ((null previous)
              (array-store array index (cons item current)))
             (t (rplacd previous (cons item current))))
       (return t)))))

Programmer: Now I'd like to do the lookup. I just realized that lookup is very similar to
insert. It also does a tree-traversal through the list structure, calculating indices and
fetching buckets. I think that this ought to be modularized so that there will only be
one place where this indexing is done. I guess what I'd like to do is write a
subroutine called index which takes an assertion, the table and the size and returns
the list of buckets which this assertion indexes to.

Apprentice: The current insert routine is a tree-traversal which produces a sequence
of indices which are handed to bucket-insert. Each occurrence of bucket-insert fetches
the bucket. You can re-arrange things so that index will do the tree-traversal,
producing the sequence of indices; each of these is fed to bucket-fetch to get the
bucket, and each of these can then be fed to a list-accumulator to produce a list of
buckets. Here's the code:


(defun index-1 (assertion table size index bucket-list)
  (do ((assert assertion (cdr assert))
       (index index (plus 1 (times index 2)))
       (car-assert nil))
      ((null assert) bucket-list)
    (setq car-assert (car assert))
    (cond
      ((atom car-assert)
       (setq bucket-list
             (cons (arraycall array
                              (hash car-assert index size))
                   bucket-list)))
      (t (setq bucket-list
               (index-1 car-assert table size
                        (times index 2) bucket-list))))))

(defun index (assertion table size)
  (index-1 assertion table size 1 nil))


Programmer: Can you fix insert to call index, rather than doing the indexing itself?

Apprentice: It would seem so. Since index produces a list of all the correct buckets, if
I do a standard list-enumeration of these I'll produce a sequence of buckets to be
handed one at a time to bucket-insert. So the general plan is still the same. But
bucket-insert will have to be changed, since it now expects its input to be an array
index, not a bucket.

Programmer: Change it to make it expect the bucket.

Apprentice: Now you've got problems. In the special case where the item being
inserted is smaller than any presently in the bucket, we had to store the new bucket
back into the table. But bucket-insert in this new version doesn't have the index of
the bucket.

Programmer: Oh well! Actually, I wanted to change the representation of the buckets
a little anyhow. I want the first item of the bucket to be a count field and then the
list of members to be the rest of the list. I think if you'll check, it will turn out that
this removes the problem, since the item to be inserted will always have to come after
the count field; you can always do this by side-effect. The list-insert routine will
have to be initialized differently, starting with the previous pointer pointing at the
whole bucket including the count field and the current pointer pointing at the cdr of
this list, which is the part with all the items in it.

Apprentice: Yes, that will work, except that I can't prove that the table will always
have such a structure in each slot. Arrays are initialized to nil, not buckets as you
just described them.

Programmer: I will write an initialization routine later which will set up the table to
have empty buckets in each slot.

Apprentice: What should be in the count field of the bucket? There is no code to
maintain it yet.

... and so on

(Note: this dialogue was extracted from a transcript of several coding sessions.)

This scenario clearly is more ambitious than anything currently implemented. In
particular, the language and discourse expertise implicit in this scenario are not even
part of my current research goals. However, the basic facilities in this system are
now under development. The apprentice system I have shown consists of four main
facilities. First, the programmer proposes designs which the apprentice checks for
logical consistency. In chapter 7 I show an example of such an analysis which was
conducted by the first implementation of REASON. In the case where a design is
incorrect, however, the scenario shows REASON framing high level descriptions of the
problem. This facility is not yet developed.

The second facility shown in the scenario is the ability to determine whether the
actual code corresponds to the design. This facility is not part of REASON at all; in
[Rich and Shrobe, 1976] we described an initial design of a recognition system which
could conduct such an analysis.

The third facility shown in the scenario is the use of pre-proven fragments to
analyze a program's behavior. This is coupled with the use of temporal collections to
segment the system into these fragments. Chapters 8 and 9 present a detailed
formalism for this kind of analysis. (Waters, 1978) reports on an implemented system
which conducts such an analysis for mathematical FORTRAN programs. The use of
plan fragments to guide the logical analysis is not yet implemented in my system.

The final facility is shown in reasoning about program modifications. This is
discussed in chapter 13. My current system is only part of the way to being able to
handle such reasoning. As the system makes deductions it records the logical
dependencies used in each step. Furthermore, once it has conducted a complete proof
it analyzes these into a summarized form called purpose links. These, together with
the complete proof, are the basis for further reasoning about modification. However,
the techniques for using these representations are still being developed.

Throughout  the  scenario,  the  apprentice  exhibits  considerable  expertise  about  data 
structures  and  side  effects.  In  Chapters  11  and  12  I  discuss  the  techniques  used  to 
conduct  such  an  analysis.  The  basic  framework  shown  in  these  chapters  is  now  well 
worked  out,  but  is  still  in  the  process  of  implementation.  The  earlier  version  of 
REASON  could  conduct  similar  side  effect  analysis,  but  was  far  less  robust  and 
flexible. 
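
The side-effect problem that the apprentice catches in the scenario, and the count-field
representation that removes it, can be contrasted in a short sketch. The Python below is
an illustration under stated assumptions, not code from REASON or from the scenario:
Cons, to_list, naive_insert and header_insert are hypothetical names, items are compared
directly rather than through maknum, and the line that increments the count field is an
assumed maintenance policy (the dialogue leaves that code unwritten).

```python
class Cons:
    """A mutable cons cell (hypothetical helper); cdr is None at the end of a list."""
    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr

def to_list(cell):
    """Collect the cars of a chain of cells into a Python list."""
    out = []
    while cell is not None:
        out.append(cell.car)
        cell = cell.cdr
    return out

def naive_insert(bucket, item):
    """Ordered insert that only mutates cells and never touches the table."""
    previous, current = None, bucket
    while current is not None and current.car <= item:
        previous, current = current, current.cdr
    if previous is None:
        Cons(item, bucket)              # new front cell is built, but nothing points to it
    else:
        previous.cdr = Cons(item, current)

def header_insert(bucket, item):
    """Bucket = header cell holding a count, followed by the ordered members."""
    previous, current = bucket, bucket.cdr
    while current is not None and current.car <= item:
        previous, current = current, current.cdr
    previous.cdr = Cons(item, current)  # always a side effect on an existing cell
    bucket.car += 1                     # assumed policy: count the members

table = [Cons(5, Cons(9))]
naive_insert(table[0], 1)
print(to_list(table[0]))                # [5, 9]: the smallest item is lost

table = [Cons(2, Cons(5, Cons(9)))]     # header cell carries the count 2
header_insert(table[0], 1)
print(to_list(table[0]))                # [3, 1, 5, 9]
```

With the header cell, the table slot never has to be stored again, because the cell it
points to is the same before and after every insertion. This is why the programmer can
say that the insertion "can always be done by side-effect", and also why the apprentice
asks for proof that every slot is initialized to such a header.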

In general, the reader should remember that the above scenario is a wish list and
that the remainder of this thesis is a progress report on the research required to
implement the facilities of the wish list.



Section  1.3:  The  Research  Content  of  This  Thesis 

The apprentice represents a practical, medium term research goal in which many
of the issues of program understanding can be explored. The representation of
programs, the ability to understand the underlying logic of a program and to reason
about the effect of program modifications are crucial prerequisites to the development
of a self-conscious system capable of serious skill acquisition. I will develop in this
thesis a representation called plans for the logical structure of programs and a
reasoning system which follows a discipline of explicit representation of its control
strategies. This will allow the system to examine its own control state and to choose
what to do next based on that examination.

Throughout the scenario above we saw the apprentice and the programmer
referring to a shared body of knowledge about standard program structure. The
apprentice talked about tree-traversal, list-accumulation, and filtering certain elements.
The middle section of this document will develop the plan formalism for representing
such notions and will present examples of some useful standard plans. Influenced by
the work in [Waters, 1977], this formalism has been extended from that in
[Rich and Shrobe, 1976] to allow plan fragments which produce and consume
sequences of objects distributed in time. The importance of this notion can be seen
in the scenario above where the apprentice develops the code for index from that for
insert. Having done this the apprentice notices that index produces a list of buckets
which is the same sequence of values as that produced by the tree-traversal
and bucket-fetch fragments which were internal to insert. When grouped appropriately,
a call to index followed by a standard list-enumeration is identical to the internal
fragments of insert. In the last section of the thesis I will consider how to analyze the
effects of such program modifications and how to maintain consistency as procedures
evolve through the design process. I will show how the analysis of programs into plan
fragments can greatly reduce the complexity of understanding modifications.
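
The equivalence claimed here, that a call to index followed by a list-enumeration
does the same work as the fragments originally interleaved inside insert, can be checked
on a small case. The following Python sketch is an illustration only. It assumes a
made-up hash (bucket_hash), represents buckets as plain Python lists rather than ordered
lists with a count field, and simplifies bucket_insert to an unordered insert; only the
index numbering (a sub-list is entered with twice the current index, and the next
element gets twice the index plus one) follows the listings in the scenario.

```python
def bucket_hash(atom, index, size):
    """Hypothetical stand-in for the scenario's hash: deterministic, in range(size)."""
    return (sum(ord(ch) for ch in atom) + index) % size

def indexed_atoms(tree, index=1):
    """Tree-traversal: yield (atom, index) pairs using the scenario's numbering."""
    for element in tree:
        if isinstance(element, list):
            yield from indexed_atoms(element, 2 * index)
        else:
            yield element, index
        index = 2 * index + 1

def bucket_insert(bucket, assertion):
    """Simplified insert by side effect (assumption: unordered, duplicates skipped)."""
    if not any(member is assertion for member in bucket):
        bucket.append(assertion)

def insert_direct(assertion, table):
    """Original shape: traversal, bucket fetch and insertion interleaved."""
    for atom, i in indexed_atoms(assertion):
        bucket_insert(table[bucket_hash(atom, i, len(table))], assertion)

def index(assertion, table):
    """Traversal and bucket fetch only, accumulated into a list of buckets."""
    return [table[bucket_hash(atom, i, len(table))]
            for atom, i in indexed_atoms(assertion)]

def insert(assertion, table):
    """Modular shape: enumerate the list from index, inserting into each bucket."""
    for bucket in index(assertion, table):
        bucket_insert(bucket, assertion)

fact = ["on", ["block", "a"], "table"]
t1, t2 = [[] for _ in range(8)], [[] for _ in range(8)]
insert_direct(fact, t1)
insert(fact, t2)
print(t1 == t2)   # True: both shapes leave the table in the same state
```

The two versions differ only in where the sequence of buckets lives: in the direct
version it exists one element at a time inside the loop, while in the modular version it
is materialized as a list between index and its consumer.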

The programs which I present in this thesis involve side-effects on complex and
shared structures. Analyzing this kind of program is a very tedious process which
people simplify using many heuristics. In chapters 11 and 12 I will show how the
reasoning system can analyze side-effects at varying levels of detail which correspond
to the levels which people seem to use. This will allow the system to develop an
understanding of what the program is intended to do, before it is forced to determine
what it actually does in all the possible "screw-ball" cases. The system is capable of
going back and being more careful in its analysis, using the information from the first
analysis to guide the second. The use of dependency information and non-monotonic
logic [Doyle, 1978] to conduct side-effect analysis is unique to this thesis.

The work reported on here is part of an ongoing project to develop a working
Programmer's Apprentice. Charles Rich and I began this work in 1974; many of
the ideas presented here were developed jointly. It is hard to identify all those places
in this thesis which have been influenced by Rich's [Rich, 1977] ideas. The
Programmer's Apprentice project was later joined by Richard Waters, who used many
of our initial ideas to analyze numerical FORTRAN programs [Waters, 1976]. Waters
found it very convenient to think of loops as being built up by a series of Plan
Building Methods. In his view a loop consists of a nucleus which produces a sequence
of values. Embedded in this nucleus are various augmentations which consume the
sequence of values produced by the nucleus. Augmentations can be made to operate
on a restricted sequence by including a filter to eliminate certain elements from
consideration.
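
This view of a loop can be pictured with a small sketch. The Python below is an
illustration under assumptions, not Waters' formalism or notation: the function names
are hypothetical, and Python generators merely stand in for a sequence of values
distributed in time. It writes one loop twice, once as a single fused body and once as
a nucleus feeding a filter feeding an augmentation.

```python
def nucleus(n):
    """Nucleus: produces the sequence n, n-1, ..., 1, one value per iteration."""
    while n > 0:
        yield n
        n -= 1

def keep_odd(values):
    """Filter: restricts the sequence seen by later augmentations."""
    return (v for v in values if v % 2 == 1)

def sum_augmentation(values):
    """Augmentation: consumes the sequence and accumulates a result."""
    total = 0
    for v in values:
        total += v
    return total

def fused_loop(n):
    """The same computation as it would appear in the program text."""
    total = 0
    while n > 0:
        if n % 2 == 1:
            total += n
        n -= 1
    return total

print(sum_augmentation(keep_odd(nucleus(7))), fused_loop(7))   # 16 16
```

In the fused version the three roles share one loop body, so they can only be told
apart by analysis; in the decomposed version each role is a separate segment whose
input and output are sequences.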

Waters' ideas find their way into this thesis as a more general notion of
temporal collections which may serve as the outputs and inputs of segments. Any
recursive program can be viewed in at least two ways, involving two distinct
segmentations. One of these, called the temporal-viewpoint, involves a cascade of
segments passing such temporal collections; the other, called the surface-viewpoint, is
simply an aggregation of the code into modules. Some features of the program are
made clear by the temporal viewpoint while others are seen more easily in the surface
viewpoint.

The program REASON reported on in this thesis has gone through several
incarnations. As part of my earlier work with Rich, an initial version of REASON
was designed and reported on in our Master's Thesis [Rich and Shrobe, 1976]. That
version was completely coded and worked as reported. However, the earlier version
was rather cumbersome and ill-suited to the recognition tasks for which it was
intended. During the later part of the development period of the first version of
REASON, [Stallman and Sussman, 1976] introduced the use of dependency networks in
their electronic circuit analysis program EL. The dependency network was then
extended and built into a separate package called the Truth Maintenance System in
[Doyle, 1978]. Although the first version of REASON maintained dependencies, it did
not have a truth maintenance system which used these to any advantage. Doyle and
others, however, built a simple problem solving language, called AMORD
[DeKleer, et al., 1977], which did interact with the TMS.




The  current  version  of  REASON  is  being  developed  as  a  program  written  in  a 
variant  of  AMORD.  I  have  had  enough  experience  with  the  first  implementation  and 
the  partially  completed  new  version  to  report  on  this  work  with  confidence. 

In chapters 14 and 15, I will attempt to evaluate this admittedly partial work
and to compare it to other, more developed systems for program understanding. I
hope that the reader will find the tedium of working through this document rewarded
by at least an occasional glimpse of something promising.




Chapter  2:  An  Engineering  Theory  of  Evolutionary  Design 

The  scenario  of  the  last  chapter  emphasizes  the  evolutionary  character  of  the 
design  process.  I  believe  that  the  key  to  supporting  such  an  evolutionary  interaction  is 
the  development  of  powerful  techniques  for  program  analysis.  Analysis  is  the  process 
of  decomposing  a  program  into  coherent  modules  such  that  the  behavior  of  the  whole 
artifact  can  be  understood  in  terms  of  the  behavior  of  its  parts. 

The dominant form of analysis in current computer science research is the
Floyd-Hoare verification technique, especially as developed by [Igarashi, et
al., 197…]. In this technique the basic unit of decomposition is the programming
language primitive, whose behavior is specified by axioms written in Hoare's logic.
Each such axiom provides a method for transforming a post-condition of a
programming language primitive into a pre-condition (or vice-versa). Verification
usually proceeds by stating a pre-condition and a post-condition for an entire program.
These are combined using a process known as Verification Condition Generation
(VCG). VCG passes the post-condition back over each language primitive of the
program in turn; the statement arrived at when the modified predicate is finally passed
over the first primitive of the program has the property that it must be true on
program entrance if the original post-condition is to be true on program exit. Finally,
an implication is formed from the pre-condition of the whole program and this new
statement. If this implication can be proven, then the program must exhibit the
behavior specified by the pre- and post-conditions.
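
The backward pass described above can be sketched for the simplest case, a
straight-line program of assignments. The Python below is a minimal illustration under
stated assumptions and is not the verifier of any cited system: predicates are
represented as Python functions over a state dictionary, the names wp_assign,
verification_condition and holds_for_all are hypothetical, and the final implication is
checked by brute force over a small finite range instead of being proven.

```python
from itertools import product

def wp_assign(var, expr, post):
    """Assignment axiom: the pre-condition is post with var replaced by expr."""
    return lambda s: post({**s, var: expr(s)})

def verification_condition(program, post):
    """Pass the post-condition back over each primitive, last one first."""
    cond = post
    for var, expr in reversed(program):
        cond = wp_assign(var, expr, cond)
    return cond

def holds_for_all(pre, vc, variables, values):
    """Stand-in for a theorem prover: test pre => vc on every state in a finite grid."""
    for combo in product(values, repeat=len(variables)):
        s = dict(zip(variables, combo))
        if pre(s) and not vc(s):
            return False
    return True

# Swap x and y through a temporary; x0 and y0 record the entry values.
swap = [("t", lambda s: s["x"]),
        ("x", lambda s: s["y"]),
        ("y", lambda s: s["t"])]
broken_swap = [("x", lambda s: s["y"]),
               ("y", lambda s: s["x"])]

pre = lambda s: s["x"] == s["x0"] and s["y"] == s["y0"]
post = lambda s: s["x"] == s["y0"] and s["y"] == s["x0"]
names = ["x", "y", "x0", "y0"]

print(holds_for_all(pre, verification_condition(swap, post), names, range(-2, 3)))         # True
print(holds_for_all(pre, verification_condition(broken_swap, post), names, range(-2, 3)))  # False
```

A real verification condition generator manipulates formulas symbolically and must
also handle conditionals and loops (the latter through invariants); the sketch shows
only the direction of the pass and the shape of the final implication.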

Given the attention paid to verification techniques in recent years, one might
think that they are sufficiently powerful to help manage the problems of evolutionary
design. I feel that such a conclusion is unwarranted. Program proving techniques will
play an important, but limited role in supporting incremental design and evolutionary
programming. It is my feeling that the techniques now in existence have been
designed with a particular kind of program - namely algorithms - in mind and that
there are a large number of distinctions between such programs and the software
systems which I am interested in. This difference of concern has led to a difference
in perspective and methodology which underlies this entire document.


1.  Much  of  this  material  was  originally  written  as  part  of  a  proposal  to  the  National 
Science  Foundation.  I  acknowledge  and  appreciate  the  extensive  editing  by  Charles 
Rich  and  Richard  Waters  which  went  into  those  sections. 




Section  2.1:  Type  of  Programs  —  Algorithms  vs.  Systems 

We  can  identify  two  different  kinds  of  programs:  algorithms  and  systems  which 
differ  along  a  number  of  dimensions.  Each  kind  of  program  is  valid  and  important  to 
computer  science,  but  they  present  different  demands  and  requirements.  I  believe  that 
program  proving  is  most  useful  and  necessary  for  the  domain  of  algorithms  while  the 
techniques  I  introduce  in  this  document  are  more  useful  for  the  analysis  of  systems. 

Typically, an algorithm is a relatively short program which can be precisely and
concisely specified. Specifications for an algorithm often are much shorter than the
program text. For example, the Euclidean GCD algorithm occupies about 7 lines of
code in any recursive language; its specification is of about the same length: the
answer divides both inputs and it also is divisible by any other common divisor of the
inputs. The Knuth-Morris-Pratt or the Boyer-Moore string matching algorithms
occupy roughly 100 lines of code but have a very short specification: the answer
returned is the position of the first string in the text which matches the input pattern.
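
The GCD case is small enough to write out in full. The following Python sketch is an
illustration, with meets_spec a hypothetical name: it pairs the recursive form of
Euclid's algorithm with the specification quoted above, stated as an executable check
for positive inputs (the check searches all candidate divisors, so it is a test of the
specification on given inputs, not a proof).

```python
def gcd(a, b):
    """Euclid's algorithm, recursive form, for non-negative integers."""
    if b == 0:
        return a
    return gcd(b, a % b)

def meets_spec(a, b, g):
    """Specification for positive a and b: g divides both inputs, and every
    common divisor of the inputs divides g."""
    divides_both = a % g == 0 and b % g == 0
    common = [d for d in range(1, min(a, b) + 1) if a % d == 0 and b % d == 0]
    return divides_both and all(g % d == 0 for d in common)

print(gcd(12, 18), meets_spec(12, 18, gcd(12, 18)))   # 6 True
print(meets_spec(12, 18, 3))                          # False: 6 does not divide 3
```

The specification says nothing about how the answer is found, which is why it stays
the same length whether the program is this one or a much more intricate variant.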

A second feature characterizing algorithms is that they exhibit a clever underlying
logic which requires proof. The intricacies of either of the string matchers mentioned
above would lead one to doubt whether they worked unless a rigorous proof were
presented. Indeed, the cleverness underlying any particular algorithm makes its code
quite different from that of other algorithms. Thus, one finds few familiar clichés in the
code. Instead one must work hard to find an explanation for the function of each
line. In addition, since algorithms are meant to be used as components of other
programs it is crucial that they be known to be correct; a single mistake could have
thousands of repercussions.

Algorithms are built to satisfy a precisely stated specification which has general
utility. The specification is not subject to change or reinterpretation. An algorithm
is not an evolutionary program. Euclid's and Pingala's algorithms have survived in
essentially unchanged form for more than a millennium.

A final point about an algorithm is that it frequently represents an extremely
optimized method for achieving a very common task. This optimization is achieved
through clever and often obscure techniques. But in the case of an algorithm this is
allowable and even desirable. The algorithm is published with an explanation; it is not
intended to be modified and therefore intricacy is appropriate if it leads to an
improvement over previous techniques.



One might be inclined to think of a system as nothing more than a large
collection of algorithms. However, the description above should make it clear that the
whole is more than just the sum of its parts. Each of the characterizing aspects of
algorithms is in fact untrue of systems. Software systems are large programs with
specifications and other related documentation exceeding the size of the code by an
order of magnitude. The specifications are not crisp, well defined, or permanent.
Indeed, they often are tied to social and institutional practices which change for
reasons having nothing to do with computation. Financial and management systems
are dependent on tax codes, business practices, etc. Military related systems depend
on the arms technology and defense strategies of the world powers. When specifying
systems which function within such a nexus, it is impossible to state precisely what is
to be done; one instead states some criteria which must be met and others which are
suggestive of less crucial but desired behavior. In any event these criteria change, and
the system is forced to evolve to meet the new criteria.

Although much can be done with methodologies for requirements definition,
experience in the military and in industry suggests that such methodology
has a significant but limited impact. It is an empirically undeniable fact that
specifications are never completely elaborated and that they evolve with the life of a
system. Even in the most careful of system designs the product passes through many
iterations of design and coding before something acceptable is developed. In the next
section I will present an argument for why this must be the case.

Systems also differ from algorithms in the degree to which they involve clever
and intricate logic. Typically a system of programs is made up of a large number of
relatively small modules, each of which involves routine and mundane code. There is
a vocabulary of clichés out of which such code is built and the experienced
programmer can analyze such routine coding patterns by inspection. Occasionally
something idiosyncratic is thrown in but even these are usually simple to understand.
Even at higher levels of the system, sub-modules are combined in routine ways.
Verification of such modules would be conducted mainly as a method of isolating
coding mistakes such as fencepost errors and typos.

The complexity of a system does not primarily arise from the use of locally
intricate strategies but rather from the sheer number of interactions between modules.
These make it difficult to assess the effect of a proposed change to the system since
each module may enter into purposeful relationships with many others. Systems tend
to reach a point where the volume of these interactions overwhelms unaided human
abilities to manage the complexity. Once this point is reached changes to the system
produce more harm than good. Rather than continuing to evolve, the system is
frozen and a new system is commissioned.

In summary, we may distinguish between algorithms and systems along two major
dimensions. Algorithms are permanent, almost mathematical objects, which are not
subject to frequent modification of either code or specification. Systems are
impermanent, evolutionary programs of little mathematical interest. The complexity of
an algorithm is largely due to the use of locally intricate and clever strategies; the
complexity of a system is due primarily to the sheer volume and number of
interactions between modules.

These distinctions lead one to see the need for different kinds of automated
design tools. The designer of algorithms needs the use of proof checkers, theorem
provers and verification systems. While these serve a useful role for the systems
designer as well, they are not his bread and butter. Instead he needs tools to help
him modify current designs to meet incrementally new requirements. Given the size
of a software system, one cannot tolerate the delay and expense of a completely new
analysis every time such an evolutionary modification is desired. One instead requires
incremental processing in which a small change in the design should require only a
small amount of reprocessing to achieve an adequate analysis.
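
One way to picture the incremental processing asked for here is an analysis that
records which other results each result depended on, and discards only those when a
module changes. The Python below is a minimal sketch under stated assumptions and
does not describe REASON's mechanism: the Analysis class, its methods, and the toy
total_cost analysis are all hypothetical, and the recorded dependencies are simple
"this result used that result" links.

```python
class Analysis:
    """Caches per-module results and the dependencies used to compute them."""
    def __init__(self, analyze):
        self.analyze = analyze      # analyze(name, source, get) -> result
        self.sources = {}
        self.results = {}
        self.dependents = {}        # name -> names whose cached result used it
        self.runs = 0               # how many module analyses were performed

    def define(self, name, source):
        """Install or change a module, discarding only results that depend on it."""
        self.sources[name] = source
        self._invalidate(name)

    def _invalidate(self, name):
        self.results.pop(name, None)
        for user in self.dependents.pop(name, set()):
            self._invalidate(user)

    def result(self, name):
        """Return the cached result, re-analyzing only if it was discarded."""
        if name not in self.results:
            self.runs += 1
            def get(other):
                self.dependents.setdefault(other, set()).add(name)
                return self.result(other)
            self.results[name] = self.analyze(name, self.sources[name], get)
        return self.results[name]

def total_cost(name, source, get):
    """Toy analysis: a module's own cost plus the cost of everything it calls."""
    own, callees = source
    return own + sum(get(c) for c in callees)

a = Analysis(total_cost)
a.define("hash", (1, []))
a.define("index", (2, ["hash"]))
a.define("insert", (3, ["index"]))
a.define("init", (1, []))
print(a.result("insert"), a.result("init"), a.runs)   # 6 1 4

a.define("init", (5, []))          # unrelated change: one re-analysis
print(a.result("insert"), a.result("init"), a.runs)   # 6 5 5
```

A change to a module that nothing depends on costs one re-analysis, while a change to
hash would discard the results for index and insert as well. The amount of reprocessing
follows the recorded dependencies rather than the size of the system.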

Even if a program can be proven correct, and even if this can be done in an
incremental fashion, there is still a problem. The sheer volume of information
developed during the proof of a software system renders the information useless unless
the analysis structures the information so that it is comprehensible to a human. When
a system designer sets out to make a modification, only a tiny fraction of the
information is actually relevant; the system designer needs tools which can produce
just this information and no more. My work is directed towards the production of
such tools.


Section 2.2: What Characterizes Evolutionary Change?

As we saw in the scenario of the last chapter, the user repeatedly proposes
designs and then debugs these designs until he is convinced that they achieve the
desired goal. In some cases he even goes so far as to reorganize the program,
breaking down some module boundaries and erecting new ones in their stead. For
example, the code for indexing an assertion is extracted from the insert code, modified
and then made into a module which is called from the lookup routine. At another
point the user decides to change the structure of the buckets.

In each of these cases the proposed modifications have ramifications which reach
beyond the boundaries of any modules apparent in the code, yet the programmer
clearly thinks of them as incremental or evolutionary changes. I wish to contrast
these evolutionary changes with the situation in which the programmer cannot
accommodate whatever change he is considering within his current conceptualization of
the system and so redesigns "from scratch". In an evolutionary change, the main
parameters of the current design are left as they were; only some small set of changes
is needed to achieve the desired goal. In redesign, the entire structure of a program
is blocked out anew.

How can we tell which of these categories a change will fall in? This is a key
question since I want my system to be prepared to handle those changes which an
experienced programmer will think of as evolutionary. The received wisdom in
programming methodology is the principle of modularization. This principle
(which is the computer science version of "a place for everything and everything in its
place") suggests that if module boundaries are carefully arranged so as to localize each
design choice to a single module then all evolutionary changes can be handled by local
changes to a small number of modules. However, there are clearly evolutionary
changes which do not fit into this paradigm. As I have already mentioned, the
scenario contains examples of the programmer making modifications in which the
module boundaries are rearranged, yet these changes are clearly thought of as
evolutionary.

The principle of modularization suggests that evolutionary changes are those
which are local to a module. Although I believe that this notion is overly rigid, I do
believe that the notion of locality within a decomposition is the crucial idea which
characterizes those changes which can be treated as evolutionary modifications. In the
next several sections I will develop the following thesis: Engineering analysis consists of
the use of partially accurate models to allow a system to be decomposed into multiple,
overlapping, tangled hierarchies. A modification will be perceived as evolutionary
if there is at least one decomposition such that within its segmentation
structure the effect of the modification appears local.



Section  2.3:  Why  Is  Evolutionary  Design  Necessary? 

Why  don't  we  just  design  correct  programs  to  begin  with  and  dispense  with  the 
expense of design iterations, debugging, and evolution? There are those, for example
[Dijkstra, 1976], who believe that such a fault-free methodology is both desirable and
possible.  In  this  view,  one  would  start  with  specifications  for  a  program's  behavior 
and  refine  these  in  a  top-down  step-wise  manner  until  a  correct  program  had  been 
reached.  It  is  claimed  that  by  carefully  stating  the  invariants  (for  example  on  loops) 
before  they  are  coded,  one  can  assure  a  high  degree  of  reliability  for  the  code 
produced.  This  may  be  summarized  by  saying  that  one  should  have  a  proof  of 
correctness  in  mind  when  one  begins  to  code. 

As far as this goes it presents little to argue with; however, the methodology
provides  little  concrete  guidance  as  to  how  one  should  develop  the  design  and  proof  of 
correctness  to  begin  with.  I  suspect  that  if  this  approach  is  suitable  at  all,  it  is  only 
useful for the creation and implementation of algorithms. As I have observed, the
process of algorithm development is quite different from that used for the design of
large software systems. Algorithms are the result of months (or years) of research;
when a researcher has the insight for a new algorithm, he can then proceed through a
top-down design process in which his insight is elaborated into a design for the
coding. This can and should include specifications for the sub-modules of the
algorithm. Such careful specification and elaboration of the algorithm's design can
then lead to a correct or nearly correct coding of the program.

The design and construction of large software systems is quite different. I have
already  observed  that  systems  are  evolutionary  by  nature.  One  reason  for  this  is 
external;  the  design  of  a  system  often  depends  on  social  and  institutional  practices 
which  change  quite  frequently.  However,  there  is  also  an  internal,  cognitive  reason 
why  systems  are  designed  incrementally,  namely  that  the  cognitive  complexity  of  the 
task  allows  no  other  approach.  The  designing  of  a  software  system  is,  in  my  view,  a 
form of problem solving not very different from that used in conventional
engineering disciplines such as electrical engineering or even in common sense
reasoning. The overriding goal of such forms of reasoning is to manage and reduce
the complexity of the design task to the point where human cognitive powers are
adequate to produce a reasonable solution. Much of Artificial Intelligence research on
problem solving has consisted of the development of paradigms which account for this
type of reasoning. These center around the related ideas of decomposition, modeling,
and  debugging  as  intrinsic  parts  of  the  planning  process. 




Common  sense  reasoning  and  engineering  problem  solving  share  a  need  to  limit 
the  complexity  of  the  planning  space.  In  both  these  domains  if  all  possibly  relevant 
details  were  to  be  considered  at  once  they  would  overwhelm  human  cognitive  capacity. 
Thus,  rather  than  trying  to  guarantee  a  perfect  answer  from  the  start,  one  works  for 
an  answer  which  is  close  enough  and  then  modifies  this  to  fit  the  actual  needs.  If 
one  does  not  take  this  approach  but  rather  insists  on  perfection  from  the  start,  the 
planning process would stall out at the first step.

The  goal  of  a  problem  solver  is  to  piece  together  a  collection  of  actions  which 
will achieve a specified set of goals. Typically, the problem solver only has to achieve
these goals given that certain conditions hold in the initial world state. This collection
of actions is called a "plan" and consists of several forms of information: first, a set
of sub-steps and their behavioral descriptions; second, a set of constraints on the
ordering of sub-step execution; third, a means of propagating information between the
sub-steps. Finally, and most importantly, the plan includes an explanation of how the
segments interact to achieve the desired goals. Notice, however, that the sub-steps
are not necessarily primitive actions; the problem solver may have to attempt their
solution  recursively. 

The  earliest  planning  systems  used  the  paradigm  of  heuristic  search  in  which  the 
problem  solver  repeatedly  tries  to  take  a  single  step  from  its  current  world  state  to 
another state which is hopefully closer to the goal. This approach was used in
systems like GPS [Newell et al., 1959] and STRIPS [Fikes & Nilsson, 1971];
however, it was found to lack sufficient power for tasks of even moderate complexity.
Minsky's suggestion of the notion of "islands" [Minsky, 1961], however, led to a new
paradigm which was partially embedded in the PLANNER language [Hewitt, 1972].

The  key  insight  in  PLANNER  is  that  a  reasonably  knowledgeable  problem  solver 
will  often  recognize  the  form  of  the  answer;  having  done  so  it  can  propose  a  partially 
instantiated  plan  immediately.  Such  plans  are  used  as  the  starting  point  of  the 
solution  with  the  problem  solver  recursively  attempting  to  synthesize  a  plan  for  the 
sub-steps by recognizing the form of their answer. This method of problem solving
(called Planning by Recognition of the Form of the Answer) was formalized in the
Planner  programming  language  and  was  improved  upon  in  its  descendants  Conniver 
and  QA4. 



Within  particular  domains  of  expertise  the  paradigm  of  Planning  by  Recognition 
of  the  form  of  the  answer  is  facilitated  by  the  development  of  an  engineering 
vocabulary  which  conveniently  captures  the  abstract  form  of  most  problems.  In 
particular, within programming domains one often can identify an "intermediate
vocabulary" of programming abstractions which constitute the building blocks out of
which a large percentage of the known techniques of particular domains are built.

When viewed from the perspective of analysis, the Recognition Paradigm takes
the form of Analysis by Inspection. As a planning paradigm, Recognition decomposes
the  problem  into  a  pattern  of  sub-steps  by  recognizing  the  form  of  the  problem.  As 
an  analytic  technique,  Inspection  reconstructs  the  form  of  the  problem  by  recognizing 
the  pattern  of  sub-steps  in  the  device.  Both  of  these  rely  on  the  existence  of  a 
powerful  body  of  "standard  plans"  which  reflect  the  common  ways  of  achieving  those 
goals whose form is understood. The existence of this body of knowledge reduces the
heavy  cognitive  cost  of  heuristic  search  to  the  much  less  burdensome  price  of  searching 
a  "plan  library". 
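The contrast between searching a plan library and searching a state space can be illustrated with a minimal sketch. It assumes a toy library keyed by goal name; the goal names and the function plan_by_recognition are hypothetical and are used only for illustration (Python stands in here for the LISP of the original system).

```python
# Hypothetical plan library: each recognized goal form maps to the
# sub-steps of its "standard plan". All names are illustrative only.
PLAN_LIBRARY = {
    "sum-a-list": ["enumerate-elements", "accumulate-sum"],
    "enumerate-elements": ["cdr-down-list"],
}


def plan_by_recognition(goal, library):
    """Return (goal, sub_plans), expanding each sub-step recursively.

    A goal whose form is not in the library is treated as primitive and
    gets an empty list of sub-plans.
    """
    sub_steps = library.get(goal, [])
    return (goal, [plan_by_recognition(step, library) for step in sub_steps])


plan = plan_by_recognition("sum-a-list", PLAN_LIBRARY)
```

Each expansion is a single lookup rather than a step-by-step search, which is the reduction in cost that the paragraph above attributes to a body of standard plans.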

However,  even  the  paradigm  of  Planning  by  Recognition  does  not  adequately 
model  human  problem  solving  behavior  on  very  complex  tasks.  Yet  another  paradigm, 
that of Planning in an Abstraction Space [Sacerdoti, 1973], must be added. In this
paradigm we add to the above notions a further idea, that of modeling. An
abstraction space is a model of the real world in which some important details are
intentionally (or otherwise) omitted. Planning is first attempted in such an
Abstraction  Space.  If  a  completely  developed  plan  is  formed  in  the  abstraction  space, 
then  the  process  advances  to  an  attempt  to  modify  the  plan  to  function  in  a  less 
abstract  space. 

Notice  that  this  recursion  of  planning  and  refining  in  a  hierarchy  of  abstraction 
spaces  is  a  different  recursion  than  the  recursive  invocation  of  the  problem  solver  on 
the subgoals. In the latter recursion the metric is the size of the task; in the former it
is the accuracy of the modeling space. An important consequence of this paradigm is
that as one proceeds through increasingly accurate models, a new plan is formed by
incremental modification of the plan produced by the preceding stage. One
implementation of this paradigm was embodied in the Abstrips program
[Sacerdoti,  1973],  a  descendant  of  Strips.  Comparisons  between  the  two  programs 
showed  that  Abstrips  could  outperform  Strips  by  a  factor  of  4;  as  the  problems  grew 
harder  the  difference  between  the  two  systems  became  even  more  pronounced. 
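The idea of planning in an abstraction space and then repairing the plan in a more detailed space can be sketched as follows. This is an illustrative toy and not the Abstrips program: it assumes that each precondition carries a numeric criticality, that an abstraction level hides every precondition below that level, and that repair consists only of splicing in an operator that adds a missing fact. The operators and facts are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    name: str
    preconditions: dict              # fact -> criticality (higher = more essential)
    adds: set
    deletes: set = field(default_factory=set)


def visible_preconditions(op, level):
    """Preconditions still considered at the given abstraction level."""
    return {fact for fact, crit in op.preconditions.items() if crit >= level}


def first_failure(plan, initial_state, level):
    """Simulate the plan; return (index, missing facts) or None if it works."""
    state = set(initial_state)
    for index, op in enumerate(plan):
        missing = visible_preconditions(op, level) - state
        if missing:
            return index, missing
        state = (state - op.deletes) | op.adds
    return None


def refine(plan, initial_state, level, operators, max_rounds=20):
    """Splice set-up steps into the plan until it works at the given level."""
    plan = list(plan)
    for _ in range(max_rounds):
        failure = first_failure(plan, initial_state, level)
        if failure is None:
            return plan
        index, missing = failure
        fact = sorted(missing)[0]
        setup = next(op for op in operators if fact in op.adds)
        plan.insert(index, setup)
    raise RuntimeError("could not repair the plan within the round limit")


open_door = Operator("open-door", {"at-door": 2}, {"door-open"})
go_through = Operator(
    "go-through", {"at-door": 2, "door-open": 1}, {"in-next-room"}, {"at-door"}
)

abstract_plan = [go_through]          # adequate when "door-open" is ignored
detailed_plan = refine(abstract_plan, {"at-door"}, 1, [open_door, go_through])
detailed_names = [op.name for op in detailed_plan]
```

The detailed plan is obtained by modifying the abstract one rather than by planning afresh, which is the incremental character described above.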




Abstrips,  however,  had  only  a  very  weak  method  of  modeling  the  real  world;  its 
only  abstraction  consisted  of  weakening  the  preconditions  of  its  built-in  operations. 
Thus, its only debugging technique consisted of splicing set-up steps into the abstract
plan.


Sussman's Hacker also identified a second reason for the indispensability of
debugging. Suppose that a problem solver is presented with a goal for which it has no
plan in its library. In this case, the problem solver should attempt to reformulate the
problem  statement  so  that  it  can  be  decomposed  into  parts  whose  solutions  can  be 
found  by  Recognition.  However,  when  this  decomposition  is  made  there  is  always  the 
possibility  of  destructive  interference  between  the  plans  for  the  various  sub-parts. 
Furthermore,  until  one  has  found  plans  for  each  sub-goal  separately,  one  cannot  tell 
whether  they  interact.  Inherently  one  is  faced  with  the  need  to  debug  the  total 
solution  to  remove  destructive  interference  between  the  sub-plans. 

I  believe  that  these  paradigms  explain  the  mechanisms  used  by  people  to  manage 
the  complexity  of  planning  in  large  and  complex  domains.  One  first  constructs  a 
mental  model  of  the  domain  in  which  many  details  have  been  omitted.  This  produces 
a  search  space  of  considerably  smaller  size  in  which  it  is  computationally  feasible  to 
derive a plan. This plan, like every other, has a "proof of correctness" (or an
explanation of how it achieves its goals); however, this "proof of correctness" might
actually be incorrect since it depends upon assumptions in the model which may
violate facts in the real world. Nevertheless, these fictions in the modeling process are
extremely valuable; without them the complexity of the problem would prevent one
from building a plan at all. This "almost right" plan is refined by developing a more
accurate  model  of  the  situation  and  then  using  the  current  "proof  of  correctness"  to 
guide  the  debugging  process.  As  the  Abstrips  program  indicated,  developing  the  plan 
in  an  abstraction  space  and  then  debugging  it  is  a  computationally  cheaper  option  than 
attempting  to  develop  a  correct  plan  directly.  It  is  for  this  cognitive  reason  that 
software  must  be  designed  in  an  incremental,  evolutionary  manner. 

If  computer  based  design  aids  are  to  be  of  assistance  to  software  system 
designers,  they  must  take  cognizance  of  the  nature  of  the  design  process  which  I  have 
outlined.  Design  aids  must  satisfy  two  criteria:  First,  they  must  be  able  to  reason 
about  abstract  plans  and  their  hierarchical  structure.  Given  any  world  model  the 
design  aid  must  be  able  to  check  whether  a  proposed  design  will  achieve  its  goal. 
Since  the  plan  development  process  is  a  recursive  one  in  which  the  sub-steps  of  a  plan 
are themselves candidates for plan synthesis, the design aid must be able to understand
a proposed plan even if its sub-steps have not yet been designed. However, the
constraints imposed on these sub-steps must be remembered so that they can be
checked when the plans for the sub-steps are formulated.

The  second  major  criterion  that  such  a  system  must  meet  is  its  ability  to  deal 
with plan editing, modification, and debugging. A plan is initially developed to work
under  the  assumptions  of  the  abstract  model;  when  these  assumptions  are  revised  to 
more  closely  correspond  to  the  real  environment  or  when  the  environment  itself 
changes,  the  logic  of  the  original  plan  must  be  examined  to  see  what  dependencies  are 
no  longer  valid.  Thus,  the  design  aid  must  be  a  dependency  based  reasoning  system 
capable of sophisticated belief revision processing.

The problem of managing evolutionary design faces engineers in all disciplines.
But it is particularly acute in computer science for two reasons. First, computer
science is a young field without the maturity and experience of civil, mechanical or
electrical engineering. In a sense there is as yet no engineering discipline. Secondly,
software  engineers  deal  with  a  peculiar  problem  in  that  the  major  constraints  one  deals 
with  are  not  physical  but  social.  Since  social  phenomena  are  more  transient  than 
physical  laws,  the  modeling  process  in  software  system  design  is  unusually  hard  and 
inaccurate.  This  suggests  that  software  engineering  should  look  to  the  more  mature 
engineering  sciences  which  have  developed  sophisticated  techniques  for  managing  the 
complexity of their fields. We will see that the paradigms of problem solving
developed in Artificial Intelligence research have their counterparts within these mature
engineering domains.


Section  2.4:  What  Do  Engineers  Do? 

One  might  think  that  engineering  is  mainly  concerned  with  the  optimization  of 
numerical parameters within physical systems. If so, computer science would have
little to gain from the study of the methodologies used in engineering. Indeed,
engineers do conduct such activity, but this is only a small part of what engineering is
about. Engineering is mainly concerned with limiting the complexity of analysis.
[Bose & Stevens, 1965] give the following account of the engineering exploit:




A  physical  problem  is  never  analyzed  exactly.  This  is  a  consequence 
both  of  our  inability  to  describe  a  physical  situation  completely  and  of  the 
increasing complexity of the analysis as greater accuracy is demanded. A
problem  that  involves  events  in  the  real  world  is  always  approached  by 
making  simplifying  assumptions  that  hold  only  approximately,  thereby 
forming a model of the events under study. The problem then reduces to
that  of  analyzing  the  model.  If  the  assumptions  by  means  of  which  the 
physical  situation  was  reduced  to  the  model  are  reasonable,  then  our 
analysis  should  produce  results  that  correspond  to  observed  events,  and  the 
same  type  of  analysis  should  be  useful  in  predicting  the  behavior  for  other 
similar physical situations.

I  have  identified  three  areas  of  technique  which  seem  to  be  common  to  all 
engineering  disciplines  and  which  provide  fruitful  starting  points  for  the  development 
of a similar technology for software engineering. These areas are: (i) The construction
of "almost accurate" models which reduce the complexity of a pure physical analysis
by introducing tolerable inaccuracies; (ii) The decomposition of complex systems into
several possibly overlapping, almost hierarchical organizations in which aspects of the
behavior of the whole artifact may be simply inferred from the behavior of the
sub-systems; (iii) The development of a vocabulary of characteristically useful
intermediate constructs which allow analysis by inspection.

Engineering  models  reduce  the  complexity  of  an  analysis  by  omitting  details  not 
relevant to the task at hand. Electrical engineers, for example, use models of
transistors  which  describe  their  behavior  accurately  enough  given  that  the  transistor  is 
known  to  be  operating  within  a  certain  range  of  frequencies  and  power.  Such  models 
will, however, produce grossly incorrect results when used outside the range of their
applicability. Thus, in analyzing a system more than one model of a particular
component may be used, each model explaining the component's behavior within some
range of operation.
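The practice of keeping several models of one component, each valid within a stated range, might be sketched as below. The two models, their frequency limits, and the gain formulas are made-up placeholders chosen only to show the selection mechanism; they are not measurements of any real transistor.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass
class ComponentModel:
    name: str
    min_hz: float                     # lower bound of the range of applicability
    max_hz: float                     # upper bound of the range of applicability
    gain: Callable[[float], float]    # predicted gain as a function of frequency


# Hypothetical models, ordered from simplest to most detailed.
MODELS = [
    ComponentModel("low-frequency", 0.0, 1e5, lambda f: 100.0),
    ComponentModel(
        "high-frequency", 0.0, 1e8,
        lambda f: 100.0 / math.sqrt(1.0 + (f / 1e6) ** 2),
    ),
]


def select_model(frequency_hz, models=MODELS):
    """Return the simplest model whose range covers the operating point."""
    for model in models:
        if model.min_hz <= frequency_hz <= model.max_hz:
            return model
    raise ValueError(f"no model is applicable at {frequency_hz} Hz")
```

Refusing to answer outside every model's range, rather than extrapolating, is one way to avoid the grossly incorrect results mentioned above.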

In  the  domain  of  programming  one  also  needs  to  model  the  behavior  of  various 
parts of a system. Richard Waters and I have developed a modeling technique, called
temporal abstraction, in which some aspects of a system's behavior are made quite
easy to understand. For example, recursive programs can be temporally abstracted into
simpler non-recursive programs in which sequences of data are communicated in
parallel between sub-segments. I will give an overview of this technique later in this
chapter and will present it thoroughly in Chapter 8. In the temporal model of the
program some ordering constraints are omitted. Thus, a second model corresponding
more closely to the surface features of a program is also needed.

Engineering  modeling  makes  a  trade  off  between  accuracy  and  ease  of  analysis. 
In  order  to  be  able  to  make  the  analysis  the  engineer  is  willing  to  introduce  "tolerable 
inaccuracies".  Engineers  don't  have  to  be  perfectly  correct,  only  "close  enough". 
However,  when  a  model  is  used  inappropriately  conclusions  can  be  reached  which 
exceed the threshold of tolerable errors. One must, therefore, maintain a record of
how  each  conclusion  was  reached  so  that  a  debugging  process  can  be  invoked  to 
identify  the  source  of  the  error  and  to  substitute  a  more  appropriate  model.  I  will 
present  a  program  reasoning  system  which  uses  the  Truth  Maintenance  System 
[Doyle, 1978] to maintain this information. This allows our system to incrementally
reanalyze a program when its original models are found to be too sweeping in their
omission  of  details. 
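The kind of record keeping involved can be illustrated with a minimal sketch. It is not Doyle's Truth Maintenance System; it assumes only monotonic justifications (a conclusion holds when all antecedents of some justification hold) and shows how retracting one modeling assumption identifies the conclusions that lose their support. The class and the fact names are hypothetical.

```python
class DependencyNet:
    """Records why each conclusion is believed (illustrative sketch only)."""

    def __init__(self):
        self.assumptions = set()
        self.justifications = []      # list of (conclusion, frozenset of antecedents)

    def assume(self, fact):
        self.assumptions.add(fact)

    def justify(self, conclusion, antecedents):
        self.justifications.append((conclusion, frozenset(antecedents)))

    def believed(self, assumptions=None):
        """All facts with well-founded support from the given assumptions."""
        facts = set(self.assumptions if assumptions is None else assumptions)
        changed = True
        while changed:
            changed = False
            for conclusion, antecedents in self.justifications:
                if conclusion not in facts and antecedents <= facts:
                    facts.add(conclusion)
                    changed = True
        return facts

    def consequences_of_retracting(self, assumption):
        """Conclusions that would no longer be supported without the assumption."""
        before = self.believed()
        after = self.believed(self.assumptions - {assumption})
        return before - after - {assumption}


net = DependencyNet()
net.assume("no-side-effects-on-shared-list")
net.assume("input-list-is-sorted")
net.justify("loop-preserves-order", ["no-side-effects-on-shared-list"])
net.justify("output-list-is-sorted", ["loop-preserves-order", "input-list-is-sorted"])
net.justify("first-element-is-minimum", ["input-list-is-sorted"])

lost = net.consequences_of_retracting("no-side-effects-on-shared-list")
```

Only the conclusions in lost need to be reanalyzed; the conclusion that depends solely on the other assumption is left untouched, which is what makes the reanalysis incremental.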

A  second  area  of  technique  common  to  many  engineering  disciplines  is  the 
decomposition  of  larger  systems  into  a  (possibly  overlapping)  hierarchy  of  sub-systems. 
Each  sub-system  is  given  a  simple  description  which  describes  only  those  aspects  of  its 
behavior which are relevant to other sub-systems. We may then regard the whole
artifact  as  a  loosely  coupled  network  in  which  the  behavior  of  the  whole  system  may 
be  deduced  from  the  descriptions  of  each  subsystem.  Often,  however,  it  is  necessary 
to  decompose  a  system  in  more  than  one  way  in  order  to  derive  convenient 
explanations  for  all  of  its  behavior.  In  electrical  circuit  analysis,  for  example,  one 
makes  one  decomposition  to  facilitate  the  DC  analysis  and  a  second  decomposition  for 
the  AC  analysis.  A  single  component  may  be  present  in  both  decompositions  playing 
different roles depending on which decomposition it is viewed from.

Engineering  decomposition  techniques  include  some  of  the  most  elegant  analytical 
methods of all science. Norton's and Thevenin's equivalence theorems for electrical
networks allow one to decompose any linear electrical network into a collection of
two-terminal devices which are accurately modeled by a single source and a single
impedance.
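As a small worked example of such a decomposition, consider a resistive voltage divider driving a load. The sketch below, with made-up component values, computes the Thevenin equivalent of the divider (one source, one resistance) and checks that the load voltage predicted from the equivalent agrees with a direct analysis of the full circuit. The function names are illustrative.

```python
import math


def parallel(r1, r2):
    """Resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def thevenin_of_divider(v_source, r_top, r_bottom):
    """Equivalent source and resistance seen at the divider's output terminals."""
    v_th = v_source * r_bottom / (r_top + r_bottom)
    r_th = parallel(r_top, r_bottom)
    return v_th, r_th


def load_voltage_via_equivalent(v_th, r_th, r_load):
    """Load voltage using only the two-terminal equivalent."""
    return v_th * r_load / (r_th + r_load)


def load_voltage_direct(v_source, r_top, r_bottom, r_load):
    """Load voltage from analysis of the whole circuit."""
    r_lower = parallel(r_bottom, r_load)
    return v_source * r_lower / (r_top + r_lower)


v_th, r_th = thevenin_of_divider(10.0, 1000.0, 1000.0)
via_equivalent = load_voltage_via_equivalent(v_th, r_th, 500.0)
direct = load_voltage_direct(10.0, 1000.0, 1000.0, 500.0)
```

Once the equivalent is known, the interior of the divider is irrelevant to anything connected at its terminals.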

Perhaps  because  decomposition  is  such  a  basic  strategy,  it  is  also  a  relatively 
advanced  technique  in  computer  science.  The  use  of  sub-routines  as  procedural 
abstractions which are described by their input-output behavior is well established.
Data abstraction techniques allow a second type of decomposition. Typically, these
techniques are embodied in the features of a programming language. While I
recognize the significance of such efforts, I also note a drawback. Analysis frequently
requires  multiple  decompositions  of  a  single  system;  however,  a  programming  language 
requires  that  the  system  be  represented  by  a  single  decomposition  which  is  most  often 
correlated  with  the  imperative  structure  of  the  system. 

The  third  major  type  of  engineering  methodology  involves  techniques  to  facilitate 
analysis by inspection. For each design problem an engineer must establish the form
of the answer. Frequently the most powerful aspects of an engineering discipline exist
to facilitate analysis by inspection. In electrical engineering, for example, the notion
of complex impedance allows the inspection techniques which were first developed for
resistive circuits to be applied to circuits involving inductances and capacitances as
well. Thus, a single set of abstract forms, such as the notion of a voltage divider,
can be applied to a much broader class of circuits. Without this technique, the far
more  complicated  methods  of  differential  equations  would  be  required. 
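The voltage divider example can be made concrete with a short sketch. The same divider formula is applied first to two resistors and then, unchanged, to a resistor and a capacitor by using complex impedances. The component values are arbitrary and the function names are illustrative.

```python
import cmath
import math


def divider_ratio(z_top, z_bottom):
    """Output/input ratio of a divider; works for real or complex impedances."""
    return z_bottom / (z_top + z_bottom)


def capacitor_impedance(capacitance, omega):
    """Impedance of an ideal capacitor at angular frequency omega."""
    return 1 / (1j * omega * capacitance)


# Purely resistive divider: two equal resistors halve the voltage.
resistive = divider_ratio(1000.0, 1000.0)

# RC low-pass filter treated as the same abstract form.
R, C = 1000.0, 1e-6
omega_corner = 1.0 / (R * C)
h = divider_ratio(R, capacitor_impedance(C, omega_corner))
magnitude = abs(h)
phase = cmath.phase(h)
```

At the corner frequency the magnitude is 1/sqrt(2) and the phase is -45 degrees, obtained by inspection of the divider form rather than by solving the circuit's differential equation.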

Given  such  techniques  it  becomes  possible  to  catalogue  the  various  forms  of 
problems and their typical solutions. This is done by developing a craft or engineering
discipline  with  an  associated  vocabulary  of  macroscopic  constructs.  Although  there  are 
virtually  an  infinite  number  of  combinations  of  the  primitive  objects  of  any  discipline, 
most of these are not useful. However, a much smaller set of combinations turns out
to  have  sweeping  power  within  particular  domains.  These  form  the  "standard  plans" 
of  a  domain;  they  are  the  terms  of  the  engineering  vocabulary.  Lisp  programmers,  for 
example,  have  a  relatively  rich  vocabulary  including  ideas  like  "cdring  down  a  list", 
"tree traversal", "searching a sequence of values", "consing up an answer", etc. In
chapter  9,  I  will  discuss  the  process  of  analysis  by  inspection;  chapter  10  will  present  a 
brief  catalogue  of  some  useful  data  structures  and  chapter  13  (in  passing)  will  present 
a  description  of  some  typical  procedural  plans. 

An  engineering  approach  works  with  such  higher  level  notions  since  such 
descriptions  reduce  the  complexity  of  making  sense  of  a  device.  The  various 
techniques  which  have  been  mentioned  so  far  interact  to  allow  an  analysis  to 
decompose  the  system  into  components  whose  behavior  conveniently  explains  the 
behavior  of  the  whole.  For  example,  the  construction  of  a  temporal  model  allows  the 
system  to  be  decomposed  in  a  manner  which  separates  the  process  of  generating  a 
collection  of  objects  from  the  process  which  consumes  these  objects.  Once  this 
decomposition is performed, it frequently becomes trivial to analyze the components by
inspection. In this case, we have an interaction between modeling, decomposition and
recognition. In chapters 12 and 13 I will show another modeling technique which
similarly reduces the complexity caused by allowing side effects on shared data
structures.
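The separation of a generating process from a consuming process can be illustrated informally. The sketch below is not the temporal abstraction technique itself, only an analogy under simple assumptions: a list is represented by nested (head, tail) pairs, a recursive summation is written first, and the same computation is then viewed as a generator of elements feeding a consumer.

```python
def total_recursive(cell):
    """Recursive program: sum the elements of a (head, tail) list."""
    if cell is None:
        return 0
    head, tail = cell
    return head + total_recursive(tail)


def elements(cell):
    """Generating process: "cdring down the list", yielding each element."""
    while cell is not None:
        head, cell = cell
        yield head


def total_by_parts(cell):
    """Consuming process applied to the generated sequence of elements."""
    return sum(elements(cell))


numbers = (1, (2, (3, None)))
recursive_result = total_recursive(numbers)
decomposed_result = total_by_parts(numbers)
```

In the decomposed view each part can be recognized on its own: the generator is a standard enumeration and the consumer is a standard accumulation. The two versions perform their additions in different orders, a detail this view leaves out.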

By  using  these  techniques,  it  becomes  possible  to  impose  a  very  rich  structure  on 
a program. This structure includes several decompositions with mappings between
them,  simplifying  modeling  assumptions,  and  recognition  mappings  which  explain  how  a 
particular  fragment  corresponds  to  a  prototype  from  the  plan  library.  This  vast 
quantity  of  information  is  unified  by  the  use  of  a  dependency  based  reasoning  system 
which  records  all  logical  dependencies  which  it  discovers  in  a  special  data  base.  These 
dependencies  may  then  be  consulted  at  any  time  to  discover  the  possible  ramifications 
of any proposed modification. My thesis is that the techniques outlined above
facilitate  an  analysis  in  which  any  change  which  a  programmer  would  regard  as 
evolutionary  is  localized  within  the  module  boundaries  of  at  least  one  decomposition. 
Once  a  modification  is  localized  within  some  decomposition  the  task  of  assessing  the 
impact  of  the  change  becomes  cognitively  manageable  since  the  decomposition  renders 
irrelevant all but a small fraction of the information.

In  the  remainder  of  this  chapter  I  will  present  a  somewhat  more  detailed 
overview  of  the  techniques  which  I  have  developed  along  these  lines. 

Section 2.5: Plans and Teleology

Whether  designing  or  analyzing  a  device,  an  engineer  must  have  a 
representational system within which it is possible to utilize and coordinate information
derived through the techniques described above. In most engineering disciplines there
is  a  notion  of  the  "design  plan"  which  forms  a  skeleton  around  which  all  of  this 
information  is  arranged.  Of  all  the  issues  discussed  so  far,  the  design  plan  is  the  one 
least  well  addressed  by  other  current  work  in  computer  science. 

In  traditional  engineering  or  software  engineering,  the  behavior  of  a  device  can 
be  described  in  two  ways.  Some  properties  of  a  device  are  independent  of  its  context 
of  use.  These  properties  constitute  the  intrinsic  description  of  the  device.  The  LISP 
function append can be described intrinsically by its input-output behavior of returning
the concatenation of its arguments. A device may also be described by its role or
purpose in the plan for a larger mechanism. This is its extrinsic description. append,
for  example,  may  be  used  to  produce  the  union  of  two  disjoint  sets  represented  as 
lists. 
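The distinction can be shown with a short sketch in which Python lists stand in for LISP lists. The helper names are illustrative. The intrinsic behavior of append is the same in both calls; whether it fulfils the extrinsic role of set union depends on an assumption about its context of use.

```python
def append(xs, ys):
    """Intrinsic description: return the concatenation of the arguments."""
    return list(xs) + list(ys)


def union_of_disjoint_sets(a, b):
    """Extrinsic role: append serves as set union, assuming a and b are disjoint."""
    return append(a, b)


def is_set_representation(xs):
    """A list represents a set only if it has no duplicate elements."""
    return len(xs) == len(set(xs))


good = union_of_disjoint_sets([1, 2], [3, 4])   # assumption holds
bad = union_of_disjoint_sets([1, 2], [2, 3])    # assumption violated
```

When the disjointness assumption fails, append still concatenates correctly, but the result is no longer a valid set representation.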




A  single  part  may  have  several  extrinsic  descriptions  corresponding  to  multiple 
needs  that  it  satisfies  in  the  larger  mechanism.  A  copying  garbage  collector  such  as 
[Minsky, 1963] uses the same array of space as both the destination for reclaimed cells
and as a queue in a breadth-first tree traversal of the space of used cells. There may
also  be  several  plans  for  a  given  device,  describing  its  structure  in  different  dimensions. 
In  this  situation,  each  part  has  the  potential  for  one  or  more  roles  in  each  plan. 

The essence of understanding a mechanism is knowing the purposes of each part.
This  involves  building  a  description  of  the  mechanism  which  matches  each  part  with 
its  roles  in  the  appropriate  plans.  Each  role  in  each  plan  must  be  filled  by  some  part 
of  the  mechanism  and  the  intrinsic  properties  of  that  part  must  satisfy  the  extrinsic 
properties  of  its  roles. 

Certain  plans  or  plan  fragments  can  appear  as  part  of  the  plans  for  many 
different  devices.  For  example,  the  depth  first  tree  traversal  plan  fragment  appears  in 
several of the modules coded in the scenario. However, understanding the teleological
structure of a plan fragment (which may be very difficult) need only happen once.
Any properties of the plan fragment which can be discovered are known to hold
wherever  the  plan  is  used.  These  common  plan  fragments  serve  as  an  "engineering 
vocabulary". 


Section 2.6: Representing Plans

Supporting  a  programmer  during  design  evolution  requires  the  apprentice  to 
reason  about  program  designs  before  they  have  been  committed  to  code.  This  requires 
the  apprentice  to  have  a  program  representation  which  is  independent  of  the  choice  of 
programming language. In our Master's thesis, Charles Rich and I
[Rich & Shrobe, 1976] presented such a representation for abstract programs which we
called plans. We reasoned that programs, like other engineered artifacts, should have a
simple  underlying  conceptual  structure  consisting  of  a  decomposition  into  parts  and 
means  for  communication  between  these  parts.  When  we  specialized  this  observation 
to  programs,  we  observed  that  the  functions  performed  by  programming  language 
primitives fall into two categories which might be called "actions" and "connective
tissue". Actions are modules which operate on a set of input data objects, yielding a
set of new and modified output objects. Connective tissue arranges the flow of data
and control between the actions.



We  then  designed  a  formalism  based  on  this  observation.  The  formalism  consists 
of segments, control-flow links, data-flow links, and abstract data objects. An
abstract program is represented as a set of segments connected by data-flow and
control-flow links which specify how information propagates between the segments and
which partially constrain the execution of the sub-segments. Segments are actions;
they are used to represent the sub-steps of the program. Segments may be nested one
within the other, yielding a super-segment and sub-segment relationship.

Each segment has a set of local names for its input objects and a second set of
local names for its output objects; these names may be thought of as "ports". A
data-flow link is a directed connection between the ports of two segments. Typically
the connection is made between the output port of one sub-segment and the input
port of a second sub-segment, indicating that the output object named by the first
sub-segment's port will flow to the input port of the second sub-segment. A data-flow
link may also connect the input port of a super-segment to the input port of one of
its sub-segments; finally, a data-flow link may connect the output port of a sub-segment
to the output port of its super-segment. Data-flow links imply an ordering of
execution: a segment which terminates a data-flow link cannot begin execution until
the datum is available at the initiating port of the link. Control-flow links are
directed connections between two segments, implying that the first segment must
terminate before the second segment may begin. A plan consisting of segments and
these two types of flow links may not completely constrain the ordering of
sub-segment execution. Thus, as observed in [Sacerdoti, 1975], plans are non-linear.
They are inherently a two-dimensional structure, the linearization of which accounts
for most of the complication of language design.
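To make the non-linearity concrete, here is a minimal sketch in Python (not the notation of the plan formalism itself). It assumes a plan is just a list of segment names plus two lists of links, and it only models links between sibling sub-segments; the class name `Plan`, the port names, and the example segments are all hypothetical. The sketch enumerates every execution order that the links permit.

```python
from dataclasses import dataclass, field
from itertools import permutations


@dataclass
class Plan:
    """A toy plan: segment names plus data-flow and control-flow links."""
    segments: list
    # Each data-flow link: (producer, output_port, consumer, input_port).
    data_flow: list = field(default_factory=list)
    # Each control-flow link: (first_segment, second_segment).
    control_flow: list = field(default_factory=list)

    def ordering_constraints(self):
        """Both kinds of link force the source segment to run first."""
        pairs = {(src, dst) for src, _, dst, _ in self.data_flow}
        pairs |= set(self.control_flow)
        return pairs

    def linearizations(self):
        """Every total order of the segments that respects the links."""
        constraints = self.ordering_constraints()
        valid = []
        for order in permutations(self.segments):
            position = {name: i for i, name in enumerate(order)}
            if all(position[a] < position[b] for a, b in constraints):
                valid.append(order)
        return valid


# Hypothetical example: "scale" and "offset" both consume the output of
# "read" and both feed "combine"; nothing orders them relative to each other.
plan = Plan(
    segments=["read", "scale", "offset", "combine"],
    data_flow=[
        ("read", "value", "scale", "x"),
        ("read", "value", "offset", "x"),
        ("scale", "result", "combine", "a"),
        ("offset", "result", "combine", "b"),
    ],
)
orders = plan.linearizations()
```

Here `orders` contains two sequences, differing only in whether `scale` or `offset` runs first. A conventional programming language forces the author to pick one of them; the plan leaves the choice open.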

The plan formalism is intended to represent designs; however, these designs
eventually turn into code in some particular language. A technique called surface flow
analysis was developed to bridge the gap between the two forms of analysis. Primitives
such as if-then-else, variable assignment, argument passing, etc., which are
concerned solely with ordering and communication, are translated into data- and
control-flow links. Other primitives such as arithmetic operations, cons, car, cdr, etc. are
translated into segments. Such surface flow analyzers have been developed for LISP
[Rich & Shrobe, 1976] and FORTRAN [Waters, 1978].




During design a segment represents one step of a problem decomposition.
Therefore, a means is required to specify abstractly what a segment does. This is
done by stating the segment's specs, which consist of: (i) a set of input names; (ii) a set
of preconditions which must hold immediately prior to program execution; (iii) a set of
output names; (iv) a set of post-conditions which are guaranteed to hold immediately
following the segment's execution. Alternatively, one can specify what a segment does
by stating its plan, i.e. by presenting its decomposition into sub-segments.

Segments may have a conditional structure, which is stated by breaking the
segment up into cases. Each case is applicable under certain circumstances which are
stated in the segment's specs. Control-flow links can be attached to a particular case
of a segment; the segment which terminates such a link is executed only if the
particular case is applicable. This creates mutually exclusive control paths which can
be united by a join segment.

The plan formalism can be interpreted by a symbolic evaluator which is in many
regards quite similar to a LISP interpreter. However, the symbolic interpreter uses
typical or symbolic data as input. Therefore, it must explore all control paths. In
addition, it must use a reasoning system to deduce whether the pre-conditions of each
segment are satisfied. The symbolic evaluator is described in a later chapter.

I will often present plans using a graphical formalism. The symbols in this
formalism are shown below.


[Figure: legend of plan-diagram symbols: a normal segment (SEG-1); a segment with
cases (TEST-1); a join segment (JOIN-1); data-flow and control-flow links; a
sub-segment nested within a main segment; recursive nesting.]



As an example of how this formalism is used, consider a programmer designing
the symbol table for a block-structured language like ALGOL-60. A hash table
might be used to store and retrieve the symbols efficiently. Each symbol is given a
new entry in the table when it is first encountered; as the symbol is encountered in
new blocks, the entry is marked with the block-id of the new block.

To achieve a simple action such as marking a symbol with a block identifier
(block-id), several other operations such as hash-table-lookup, ordered-insert, rplaca, etc. are
called upon. These sub-actions interact to achieve the desired goal of having the
symbol table indicate that the specified symbol is defined in the indicated block. This
is done as follows. First, hash-table-lookup is called to see if the symbol is defined in
the table. If it is, the entry returned by hash-table-lookup is passed to ordered-insert,
which inserts the block-id of the specified block into the entry's list of block-ids. If
the symbol has no entry in the table, new-entry is called to build a new hash-table
entry; the new entry is created with a block-id list including exactly the specified
block-id. This new entry is passed to hash-table-insert, which inserts it into the table.

This  can  be  diagrammed  as  follows: 




[Figure: Plan Diagram for Mark-Present Operation]

hash-table-lookup has a case structure; it performs a test and splits control into several
paths, depending on the result of the test. The two control paths are rejoined by the
join segment. Notice that crossed lines show the flow of control between
segments; normal lines show the flow of specific data objects.
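Since the diagram itself does not survive in this copy, the following Python sketch may help in reading the description. It is only an illustration under stated assumptions: a Python `dict` stands in for the hash table, an entry is a small dictionary, and the function names simply transliterate the segment names used in the text. The `if`/`else` plays the role of the case split performed by the lookup, and the single `return` plays the role of the join segment.

```python
import bisect


def hash_table_lookup(table, symbol):
    """Return the entry for symbol, or None when the symbol is absent."""
    return table.get(symbol)


def ordered_insert(entry, block_id):
    """Side-effect the entry: add block_id to its sorted list of block ids."""
    bisect.insort(entry["block_ids"], block_id)


def new_entry(symbol, block_id):
    """Build an entry whose block-id list holds exactly the given block id."""
    return {"symbol": symbol, "block_ids": [block_id]}


def hash_table_insert(table, entry):
    """Side-effect the table: make the entry a member of it."""
    table[entry["symbol"]] = entry


def mark_present(table, symbol, block_id):
    entry = hash_table_lookup(table, symbol)
    if entry is not None:
        # Case 1: the symbol is already in the table.
        ordered_insert(entry, block_id)
    else:
        # Case 2: the symbol is new.
        hash_table_insert(table, new_entry(symbol, block_id))
    # Join: both control paths deliver the same (updated) table.
    return table


table = {}
mark_present(table, "x", 3)
mark_present(table, "x", 1)
mark_present(table, "y", 2)
```

After these three calls the entry for `x` lists blocks 1 and 3 and the entry for `y` lists block 2.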

Notice that many of the modules used to build mark-present will eventually have
internal structure of their own. ordered-insert, for example, will probably consist of a
search loop, a cons, and a rplacd. The hash-table routines will involve steps such as hash,
bucket-fetch, etc. Thus the structure given above is a layered one, nesting boxes within
boxes until one finally reaches programming language primitives.

Each segment in the plan above can be thought of as promising that certain
conditions will hold after its execution as long as its preconditions are satisfied. Such
sets of promises are stated using the specs formalism. As mentioned above, specs have
four clauses. Two of these, inputs and outputs, provide a list of input and output
names which are bound to the actual inputs and outputs of the segment. The other two
clauses are the expect and assert clauses, which state the pre-conditions and the
post-conditions of the segment. A typical set of specs looks like


    (defspecs fetch-bucket
      (inputs table-1 index-1)
      (expect (object-type table-1 hash-table)
              (index table-1 index-1))
      (outputs bucket-1)
      (assert (bucket table-1 index-1 bucket-1)))


which states that, given a well-formed hash-table and an index of that table,
fetch-bucket will return the bucket of the table indexed by the input index-1.

Notice that since the specs for a segment refer only to the segment's I/O
behavior, they can apply to any segment which accomplishes the behavior required.
Thus, a specs is a type applying to different algorithms for the same function. The
square root specs describe a program using Newton's method as well as one which uses
the halving method. It is also important to understand that the specs formalism is a
local and intrinsic description, saying what a segment does, not why it does it. Specs
have no notion of method or purpose within them.
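The square-root remark can be illustrated with a small Python sketch. It is an assumption-laden toy, not the specs formalism itself: the `Specs` class, its `check` method, and the numerical tolerance are all invented for the illustration. The point is only that one expect/assert pair accepts two quite different algorithms.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Specs:
    """I/O behavior only: a pre-condition and a post-condition."""
    name: str
    expect: Callable[[float], bool]          # must hold before execution
    asserts: Callable[[float, float], bool]  # guaranteed after execution

    def check(self, segment, x):
        """Run a segment under these specs, checking both clauses."""
        if not self.expect(x):
            raise ValueError(f"{self.name}: expect clause violated for {x!r}")
        y = segment(x)
        if not self.asserts(x, y):
            raise AssertionError(f"{self.name}: assert clause violated")
        return y


# Hypothetical square-root specs; the tolerance is an arbitrary choice.
SQRT_SPECS = Specs(
    name="square-root",
    expect=lambda x: x >= 0,
    asserts=lambda x, y: y >= 0 and abs(y * y - x) <= 1e-6 * max(1.0, x),
)


def sqrt_newton(x):
    """Newton's method: repeatedly average the guess with x / guess."""
    if x == 0:
        return 0.0
    guess = max(x, 1.0)
    for _ in range(100):
        guess = 0.5 * (guess + x / guess)
    return guess


def sqrt_halving(x):
    """Halving method: bisect the interval known to contain the root."""
    low, high = 0.0, max(x, 1.0)
    for _ in range(200):
        mid = (low + high) / 2
        if mid * mid > x:
            high = mid
        else:
            low = mid
    return (low + high) / 2


newton_result = SQRT_SPECS.check(sqrt_newton, 2.0)
halving_result = SQRT_SPECS.check(sqrt_halving, 2.0)
```

Nothing in `SQRT_SPECS` mentions how the root is found, which is exactly why it cannot say why either algorithm works.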

However, for an engineered device to function properly, it is necessary that the
pattern of interactions between sub-modules guarantee that every module's
expectations be satisfied at the time of its invocation. Further, the pattern of
interactions must guarantee that the desired behavior of the whole device will result
from the behaviors of the parts. It is only within this more global and extrinsic
description that a notion of purpose is found. For example, in a hashing system we
can talk about the purpose of the hashing step: it computes the index of the bucket in
which the desired object should be found, eliminating the need to search through
other buckets which cannot contain the object. Similarly, in a hash-table-insert routine
the purpose of the list-insert routine is to splice the element into the appropriate
bucket so that it will be a member of the table.

A plan consists of a pattern of sub-segments connected together by data- and
control-flow links. Two kinds of requirements are found in a plan. First, there is
the requirement that each sub-segment's expect conditions must be satisfied; this is
called a pre-requisite requirement. Second is the requirement that the overall goals of
the main segment must be satisfied; this is called an achieve requirement. The first of
these requirements is indicated by the expect clauses of the sub-segments' specs; the
second is indicated by the assert clauses of the main segment's specs. If the plan
represents a reasonable design, then it is possible to show how the behaviors of the
sub-segments interact to satisfy these requirements. It is possible to summarize such an
argument so that it refers only to basic units of description, the spec clauses of those
sub-segments involved in guaranteeing that the requirement is satisfied. These
summarized arguments are called purpose links.

Consider the following diagram for a hash-table-insert routine. hash is called with
the table and the object to be inserted as arguments and calculates an index of the
table. fetch-bucket is called with this index and the table, producing a bucket of the
table; the bucket must be a linked list. Finally, the bucket and the object are passed
to list-insert, which side-effects the list, inserting the object into the list. This causes
a derived side effect to the table: since one of its parts is side-effected, the table is as
well. The updated table is returned as an output of hash-table-insert. The pre-requisite
and achieve conditions are indicated on the side of the diagram.


[Figure: Diagram for Hash-Table Insert Routine. Annotations on the diagram:
Pre-requisite: calculates a valid index so that bucket fetch can function properly.
Achieve: insert the object in the correct bucket so that it will be a valid member of
the table and so that the table will continue to be well formed.]

We can look at the specs of the sub-segments to see how the purpose links are
developed. The specs for hash are:


    (defspecs hash
      (inputs the-table the-object)
      (expect (object-type the-table hash-table))
      (outputs the-index)
      (assert (object-type the-index number)
              (index the-table the-index)))

We have already seen the specs for fetch-bucket above. Notice that the second assert
clause of hash implies that the second expect clause of fetch-bucket is satisfied. Now
let us look at the specs for hash-table-insert:

    (defspecs hash-table-insert
      (inputs an-object the-table)
      (expect (object-type the-table hash-table))
      (outputs ((the-updated-table id-to the-table)))
      (assert (side-effect the-table (smember the-updated-table an-object))))

The use of id-to in an outputs clause indicates that the output the-updated-table is the
same object as the input the-table. The side-effect assert clause indicates that the-table
is changed to include an-object as a member. If we look at the specs for list-insert
we will see how this clause is satisfied:


    (defspecs list-insert
      (inputs an-object the-list)
      (expect (object-type the-list list))
      (outputs ((the-updated-list id-to the-list)))
      (assert (side-effect the-list (smember the-updated-list an-object))))

Clearly, the assert clause of list-insert indicates that the assert of hash-table-insert is
satisfied; however, it does so only in interaction with the assert clauses of fetch-bucket
and hash, which indicate that an-object is inserted in the list into which it hashes. The
satisfaction of the assert clause of hash-table-insert depends on assert clauses from each
of these sub-segments.
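The chain of reasoning just described can be traced in a short Python sketch. It is a loose illustration, not the apprentice's machinery: the `HashTable` class, the use of Python lists for buckets, and the runtime `assert` statements standing in for expect and assert clauses are all assumptions made for the example. Comments mark where a pre-requisite link and the achieve link would be recorded.

```python
class HashTable:
    """A toy well-formed hash table: a fixed list of buckets (Python lists)."""
    def __init__(self, size):
        self.buckets = [[] for _ in range(size)]


def hash_segment(the_table, the_object):
    assert isinstance(the_table, HashTable)           # expect
    the_index = hash(the_object) % len(the_table.buckets)
    assert 0 <= the_index < len(the_table.buckets)    # assert: a valid index
    return the_index


def fetch_bucket(table_1, index_1):
    assert isinstance(table_1, HashTable)             # expect (first clause)
    # Expect (second clause). Pre-requisite link: this holds because of the
    # second assert clause of hash_segment.
    assert 0 <= index_1 < len(table_1.buckets)
    return table_1.buckets[index_1]


def list_insert(an_object, the_list):
    assert isinstance(the_list, list)                 # expect
    the_list.append(an_object)                        # side effect on the list
    return the_list                                   # id-to the input list


def is_member(the_table, an_object):
    """Membership means: present in the bucket the object hashes to."""
    return an_object in fetch_bucket(the_table, hash_segment(the_table, an_object))


def hash_table_insert(an_object, the_table):
    the_index = hash_segment(the_table, an_object)
    the_bucket = fetch_bucket(the_table, the_index)
    updated = list_insert(an_object, the_bucket)
    assert updated is the_bucket
    # Achieve link: membership follows from the asserts of hash_segment,
    # fetch_bucket and list_insert taken together, not from any one alone.
    assert is_member(the_table, an_object)
    return the_table                                  # id-to the input table


table = HashTable(4)
result = hash_table_insert("x", table)
```

If `list_insert` were handed any bucket other than the one selected by the hash, its own assert clause would still hold, yet the final membership check would fail; that is the sense in which the achieve link depends on all three sub-segments.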




We call these logical links between sub-segment behaviors purpose links. Those links
which explain how a sub-module's expectations are met are called pre-requisite links;
those which explain how the overall intentions of the main segment are met are called
achieve links. The pattern of purpose links, together with the data- and control-flow
links and the various sub-segments' specs, is what we term a plan. Plans play a
central role in the work of the programmer's apprentice because they explain the
teleological structure of the program: the reason why each module is present and the
logic of how the modules' configuration achieves the overall goal.

The addition of purpose links transforms the plan formalism from an abstract
programming language to a design representation which includes not only a set of
actions to be performed but also a statement of their teleological structure. Since the
sub-segments of a plan may be specified at a high level of abstraction, it turns out
that the plan formalism can easily represent abstract teleological structures. As I've
mentioned, there is a craft discipline among programmers consisting of a repertoire of
standard methods for achieving certain types of goals. There are standard ways to
traverse a tree or a list structure, and standard methods for accumulating items into a
set. These standard methods can be conveniently represented as standard plans, using
the abstraction powers of the plan formalism to capture the significant generalities of
a programming domain. The plan formalism has the added virtue of representing
these techniques in a manner which is independent of the particular programming
language being used.

Let me explain this a bit more before going on. Suppose I had a set of objects
represented by some data structure and I wished to build a collection of all
members of this data structure which satisfy some criterion. One standard technique
for accomplishing this is what I term the filtered-accumulation plan. This plan
consists of three sub-plans. The first is an enumeration plan which generates the
elements of the original data structure; if this data structure is a list, then this plan
would have the familiar pattern of "cdring down" the list; if the data structure is a
binary tree, the enumerator would have the structure of "car-cdr" recursion. The
second sub-plan is a filter plan which tests the elements produced by the first
sub-plan, selecting those which satisfy the criterion. The final sub-plan is an
accumulation plan which builds a new data structure containing those elements which
passed through the filter. If the final data structure is to be a list, this sub-plan
would have the familiar pattern of "consing up" a list.
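As a rough illustration of the three sub-plans, here is a Python sketch in which each sub-plan is a separate, replaceable piece. The names are hypothetical, trees are assumed to be `None` or `(value, left, right)` tuples, and "consing up" is imitated by prepending to a Python list.

```python
def enumerate_list(items):
    """Enumeration sub-plan for a list: walk down it element by element."""
    for item in items:
        yield item


def enumerate_tree(tree):
    """Enumeration sub-plan for a binary tree: node, then left, then right."""
    if tree is None:
        return
    value, left, right = tree
    yield value
    yield from enumerate_tree(left)
    yield from enumerate_tree(right)


def cons_up(accumulated, element):
    """Accumulation step that builds a list by adding to its front."""
    return [element] + accumulated


def filtered_accumulation(enumerator, criterion, accumulate, initial):
    """Enumerate, keep what passes the filter, fold it into the result."""
    result = initial
    for element in enumerator:
        if criterion(element):                     # filter sub-plan
            result = accumulate(result, element)   # accumulation sub-plan
    return result


def is_even(n):
    return n % 2 == 0


from_list = filtered_accumulation(enumerate_list([1, 2, 3, 4]), is_even, cons_up, [])
tree = (1, (2, None, None), (4, (3, None, None), None))
from_tree = filtered_accumulation(enumerate_tree(tree), is_even, cons_up, [])
```

Both calls yield `[4, 2]`: the enumerator differs, but the filter and the accumulator are shared unchanged.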




Now consider the code for two versions of this idea. In the first version (written
in ALGOL) the original and final data structures are arrays; the second version (in
LISP) uses a binary tree and a list.


    integer array A[1:100], B[1:100];
    integer i, j;
    j := 0;
    for i := 1 step 1 until 100 do
        if criterion(A[i])
        then begin
            j := j + 1; B[j] := A[i]
        end


    (defun fil-acc (tree) (fil-acc-1 tree nil))

    (defun fil-acc-1 (tree acc)
      (cond ((criterion (value tree)) (setq acc (cons (value tree) acc))))
      (cond ((leaf-node tree) acc)
            (t (fil-acc-1 (left tree)
                          (fil-acc-1 (right tree) acc)))))

Even ignoring the language differences, there is clearly quite a bit of difference
between the two programs, yet I have already claimed that they are actually instances
of the same general technique, filtered-accumulation. The plan formalism captures
this generality using temporal abstraction. Temporal abstraction looks at the history of
the computation, grouping together occurrences of segments of like type. For
example, in the LISP program there are recursive invocations of fil-acc-1, producing
several occurrences of segments of this type. Similarly, in the ALGOL program the
loop executes repeatedly, producing several occurrences of the loop body. Temporal
abstraction aggregates all these occurrences into a single new, abstract segment which
is called the enumerator. Each of the occurrences within the enumerator produces an
output object. In the ALGOL program the output is the contents of the ith array
slot; in the LISP program, the output is the value part of the current tree node.
These outputs are aggregated into a new, abstract data structure called a temporal
collection (since the objects are produced one by one, the collection exists across time,
rather than as a single unified data structure).




We may similarly observe that in each program there are repeated occurrences of
the criterion test. These may be aggregated into the filter sub-segment. We may
note that each program has a repeated accumulation step. In the ALGOL program
this consists of the two steps of adding one to j and then storing a quantity into the
jth slot of the output array; in the LISP program the accumulation is performed by the
cons. Again, the repeated occurrences of these steps can be aggregated into a segment.
Once this is done, we can notice that the filter segment will contain a number of
identical test segments which have no data flow between them. However, from the
successful case of each test segment there is a data flow to a sub-segment of the
accumulation plan. Internally, the accumulation plan is a cascade of identical
set-accumulators, each of which takes two inputs: (1) a set which is input from the
previous accumulator and (2) a new element; the accumulator produces a new set
which includes all the previous elements plus the new one. The set output by the
final accumulator is the output of the whole filtered-accumulation plan. From this
viewpoint both programs have the following common structure.
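The temporally abstracted view can be imitated in Python as follows. This is a sketch under assumptions of my own choosing: a temporal collection is flattened into an ordinary list, sets are `frozenset`s, and `functools.reduce` supplies the cascade, each step of which is one set-accumulator receiving the previous accumulator's output.

```python
from functools import reduce


def set_accumulator(previous_set, new_element):
    """One stage of the cascade: the previous elements plus the new one."""
    return previous_set | {new_element}


def common_structure(occurrence_outputs, criterion):
    # Enumerator: the outputs of all its occurrences, viewed as one
    # temporal collection.
    temporal_collection = list(occurrence_outputs)
    # Filter: identical tests with no data flow between them, so each
    # element is tested independently of the others.
    passed = [x for x in temporal_collection if criterion(x)]
    # Accumulation: a cascade of identical set-accumulators, starting
    # from the empty set.
    return reduce(set_accumulator, passed, frozenset())


selected = common_structure([3, 8, 5, 12], lambda x: x > 4)
```

Whether the elements came from array slots or tree nodes is invisible at this level, which is the sense in which the two programs share a structure.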




Notice that at this level of description we have left many features unspecified.
For example, we have not said what type of object is input to the enumerator nor
how it works internally. In spite of this, we do know that this pattern of interactions
(i.e. this plan) produces a set whose members are a subset of the elements contained
in the enumerated object. Furthermore, we know that this subset consists of exactly
those members which satisfy the criterion of the filter. Indeed, this general pattern of
segments is so common that one ought to recognize it where it occurs and immediately
infer that the output is exactly this subset. Languages such as CLU
[Liskov, 1974, 77] and ALPHARD [Wulf, 1974, 76] have introduced iterators and
generators to make it easier to capture these and similar notions. Temporal
abstraction will be discussed in detail in chapters 8 and 9.

The Apprentice approach to program understanding is distinct from the approach
of program verification systems like [King, 1969], [Deutsch, 1973], [Igarashi
et al., 1975]. In the Apprentice, although we require the usual logical techniques, we
do not focus our attention on the primitives of the programming language in an
attempt to write axioms for their behavior. Instead, we abstract away from the
language as rapidly as possible, building up higher levels of abstraction until a standard
plan such as filtered accumulation can be recognized.

The apprentice has several distinct components which are involved in
understanding a program. The first of these is the library of standard programming
techniques, called the plan-library. Plans, as we have seen, are stated in a language
involving data- and control-flow, rather than the primitives of any particular
programming language. Thus, a surface-flow-analyzer must translate the source code
of a program into the internal language of data- and control-flow. This
representation is grouped into segments in an attempt to recognize the various standard
plans present in the program. Work on recognition is reported in [Waters, 1978],
where plan-building methods provide initial clues to segmentation, and in
[Rich, 1977, 78], where a plan-library is used to guide a heuristic component of the
recognition system.



The  recognition  systems,  however,  must  call  on  a  reasoning  system  from  time  to 
time  to  see  whether  their  proposed  recognition  of  the  code  is  feasible.  In  later 
chapters  as  I  present  the  deductive  component  more  carefully  we  will  see  the  use  of  a 
task  agenda  and  an  explicit  recording  of  dependencies.  These  are  the  methods  by 
which  the  deductive  component  communicates  with  other  parts  of  the  system. 

The deductive component of the system plays a second role in the apprentice,
which I refer to as plan verification. The apprentice requires a large library of
standard plans whose properties have already been analyzed and recorded. While it is
a theoretical possibility that such complete analyses could be produced by hand, in
practical terms this is prohibitive. Instead, the deductive component of the system is
used to show that a plan (stated at any level of abstraction) satisfies certain properties.
In this use, REASON is presented with a plan-diagram consisting of data- and
control-flow links and specifications for the sub-segments used in the diagram. It is
then asked to show that certain properties hold; often it is useful to give REASON a
set of lemmas to be proven first, which will structure the proof and make it more
comprehensible. REASON is also allowed to ask for help if it feels that it is getting
lost. As the system develops the proof, it records all its deductions. These are then
summarized into the pre-requisite and achieve links of the plan, which is filed in the
plan library as a new standard plan.

In  the  typical  interaction  with  the  apprentice,  as  seen  in  the  scenario,  the 
programmer  first  develops  a  design  for  a  segment  of  code,  using  the  plan  library  as  a 
shared  vocabulary  of  high  level  building  blocks.  As  these  pieces  are  woven  together, 
the  apprentice  checks  that  pre-conditions  of  each  segment  are  satisfied  and  warns  the 
programmer  of  design  bugs  if  any  precondition  is  violated.  When  the  programmer 
believes  that  a  total  plan  has  been  formulated  he  asks  the  apprentice  to  check  whether 
this  plan  does  achieve  the  intended  goals. 

In general the programmer will not begin to code a segment until he has gone
through this design-checking protocol with the apprentice. Having completed the
design at the level of abstract plans, however, he goes on to write the code. It is at
this point that the surface analyzer and the recognition components are called on to
match the code and the already verified plan. If the alignment is made, then the
programmer proceeds knowing that his program accomplishes the things which he had
asked the apprentice to check.



Frequently, however, the programmer may find it more convenient to write the
code without going through the design protocol. Under these conditions the
apprentice will have weaker clues and will have to interact with the programmer more
often, asking for specifications and other hints to guide its analysis. In any event,
once the analysis is complete the apprentice will have constructed a recognition
mapping between the code and the plan for the segment; in addition, most of this plan
will have pointers back to plan fragments from the library. Thus, the apprentice can
explain the code using the high-level vocabulary of the library. Furthermore, the
apprentice will have developed and written in the notebook a complete explanation of
the intermodule dependencies, giving it the ability to examine how changes to one of
the sub-segments will affect the behavior of the whole program.

In  summary,  the  plan  of  an  engineered  device  is  a  set  of  logical  connections 
between  the  conceptual  descriptions  of  sub-modules,  the  descriptions  of  implementation 
strategy,  and  the  overall  intentions  for  the  device  being  engineered.  These  logical 
steps  explain  how  each  module  of  the  overall  device  contributes  to  the  higher  level 
conceptualization  as  well  as  why  each  sub-module  is  capable  of  functioning.  The  lack 
of  such  a  logical  connection  in  a  proposed  device  would  indicate  a  conceptual  failure 
or  design  bug.  Since  any  engineering  discipline  builds  up  a  repertoire  of  standard 
plans,  understanding  an  engineered  device  is  largely  a  matter  of  recognizing  which 
standard  plans  are  used  and  how  they  are  interfaced  to  achieve  their  intended  goals. 

Given that modules of a device may themselves be conceptual constructs with
internal structure, plans provide an abstracting mechanism describing the structure of
the device at a level appropriate to the task at hand. Plans also allow one to describe
and reason about the behavior of incompletely designed devices, since a module's net
behavioral specifications may be used within a larger plan even if there is as yet no
internal plan to accomplish the behavior of the sub-module.




Section  2.7:  Plans  in  Maintenance  and  Explanation 

Plans, as outlined above, give a teleological description of program behavior,
abstracted to a level of description which is convenient to the programmer. It is a
rather trivial matter to generate explanations of a program from a plan. Since plans
contain more information than does the program itself, such explanations will be richer
than a mere recitation of the code.

My goal, however, is to understand and support the process of program evolution.
As I have noted, plans capture the relationship between program design choices,
abstract modularization, and overall intentions. In doing this, they localize the effects
of a change in design strategy, and specify the teleological requirements which must be
satisfied in any modification of the design.

As  a  simple  case,  consider  a  hash-table  insert  routine  which  has  been 
implemented  using  ordered  linked  list  buckets  with  a  count  field.  The  code  for  such  a 
program  might  be: 


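    (defun insert (item)
      (insert-in-bucket (table (hash (key-part item))) item))

    (defun insert-in-bucket (bucket item)
      ;; The car of a bucket holds its count; the cdr is the ordered list.
      (do ((previous-list bucket (cdr previous-list))
           (current-list (cdr bucket) (cdr current-list)))
          ;; Exhaustion: no larger element was found, so add at the end.
          ((null current-list) (rplacd previous-list (list item)))
        ;; Found the first larger element: splice the item in before it
        ;; and leave the loop so that it is inserted only once.
        (and (greater-than (car current-list) item)
             (rplacd previous-list (cons item current-list))
             (return t)))
      ;; Bump the count field.
      (rplaca bucket (1+ (car bucket))))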

Suppose that for space efficiency we decide to change to a rehashing scheme.
Since this change is strictly a design issue dealing with buckets, the plan tells us that
the overall structure of the insert module itself will not have to change, but that the
insert-in-bucket module, as well as the communication between the two modules, might
require change. It further tells us that the last line (i.e. the rplaca which bumps the
count) is no longer relevant. At first glance one might guess that this is all the help
one could get.




However, the plan library reveals that there is more structure in common
between the old and new designs. In the library, plans and data structures are
organized into (tangled) hierarchies where objects lower in the hierarchy inherit
properties from those above them. In both implementations of a hash-table we have
that the buckets are linear objects; furthermore, we have a generalized version of the
search loop, called linear-search-loop, which can search any linear object such as a list
or an array. The more specific versions of linear-search-loop differ only where the choice
of representation for the particular linear object is relevant.

This difference appears in the bump, exhaustion, and termination steps. In the
re-hash scheme, bump is the re-hash operator and successful termination of the search is
indicated by a special marker (such as nil) indicating that a slot is free. Exhaustion
of the search might be indicated by the re-hash routine returning a negative number.
An item is made a member of a bucket in the re-hash scheme by inserting it in the
array.


In the linked list version, bump is the cdr operation; objects are selected by car;
exhaustion of the list is indicated by the presence of nil. Successful location of a
place to insert the object is indicated by the presence of a larger element in the next
position. Using this information, the system guides the programmer to the following
new program (I will discuss this idea further in chapter 13):


(defun insert (item)
  (insert-in-bucket (hash (key-part item)) item))

(defun insert-in-bucket (initial-slot item)
  (do ((slot initial-slot (rehash slot)))
      ((minusp slot) (error 'item 'table-full))
    ;; store the item in the first free slot, then stop
    (and (null (table slot))
         (return (store (table slot) item)))))


Typically,  the  apprentice  builds  more  than  one  viewpoint  of  the  program  during 
the  recognition  process.  During  program  modification,  one  or  another  of  these 
viewpoints  might  provide  a  perspective  from  which  the  effect  of  the  modification 
appears  quite  localized.  In  any  event,  since  the  apprentice  has  a  complete  record  of 
all  the  logical  dependencies,  it  can  easily  evaluate  whether  any  proposed  modification 
can  damage  a  desired  property. 




Section 2.8: Dependency Directed Reasoning

A plan may be thought of as an abstract program coupled with a logical analysis.
However, it is important to note that this logical analysis need not necessarily be a
"proof" in the sense of a guarantee of correctness. REASON is capable of
conducting logical arguments which range from the informal or "common sense" to the
rigorous. In many cases the plan for a program will only contain a "common sense"
or engineering analysis which is inadequate to guarantee correctness under all
conditions, but which is good enough for purposes of explaining its teleological
structure. When necessary, REASON can be asked to verify certain modules and can
carry this out with full rigor. We often observe experts making an analysis in exactly
this way. First they conduct a common sense analysis to explicate certain facts and to
establish a framework of understanding; once this is accomplished the framework
guides a more formal analysis, keeping it from getting lost in a sea of combinatorics.
It would be desirable for REASON to be able to do something like this.

Another desideratum is that an incremental change in the program should
necessitate only incremental changes in the analysis of the program. To partially meet
these desiderata REASON was designed as a dependency based system. In a
dependency based system every new assertion entered into the data base is
accompanied  by  a  justification  stating  which  other  assertions  form  the  logical  support 
for  the  new  one.  The  justification  itself  is  an  object  which  the  system  can  inspect 
and  manipulate. 

Assertions in the reasoning system have one of two statuses: in or out. An in
assertion is one which is believed. An out assertion is one not currently believed. A
special module called the Truth Maintenance System (TMS) [Doyle, 1978] is responsible
for guaranteeing that all assertions with valid reasons to be believed are in and all
assertions lacking valid justifications are out.

REASON has several uses for dependency based reasoning: management of
abstract models for programs, analysis of program modifications, and hypothetical
reasoning during theorem proving. In chapter 12 I will discuss the use of dependency
based reasoning in the analysis of side effects. In this situation, REASON first
conducts a simple analysis assuming that the degree of sharing between complex data
structures is limited. Various desired properties of the program are then proven
under this assumption. Sometimes such a cursory analysis is sufficient. However,
when a more careful exploration is desired, the assumption can be removed and



replaced by a more cautious assumption or by no assumption at all. The TMS uses
the dependencies to determine what conclusions remain valid under the new
assumptions. In many cases, some of the important properties of the program do not
depend on the assumption and remain in. However, if some property does in fact
depend on the assumption it will go out, indicating that the original proof is no longer
valid under the conditions of sharing. A more detailed proof can then be attempted.

I will now turn to the detailed presentation of the techniques used in REASON.
In chapter 3 I will first discuss the reasoning system per se; chapter 4 will introduce
the task agenda and the system's method of explicit control. Chapter 5 will present
the program description techniques in detail. Chapter 6 presents the symbolic
interpreter for the plan formalism and chapter 7 gives an example of how this can be
used in program verification. This chapter is quite tedious and can be skipped without
loss of continuity. Chapters 8 and 9 detail the technique of temporal abstraction and
its use in analysis by inspection. In chapter 10 a language for describing data
structures is introduced along with a catalogue of data descriptions which REASON
uses. These descriptions are used during program analysis and recognition and are
important to the material which follows on the analysis of side effects. However,
chapter 10 need not be read very carefully to understand the material which follows.
Chapter 11 presents my techniques for reasoning about side effects by making
simplifying assumptions. This material is extremely novel and quite distinct from
verification literature on the same subject. Chapter 12 is a brief discussion of some
concepts which can be used to make the ideas of chapter 11 more powerful. Finally,
in chapter 13 many of the previous ideas are combined in a sketch of how REASON
will eventually be able to support program evolution.




Chapter  3:  The  Reasoning  System 

Understanding programs requires a sophisticated reasoning capability. This chapter
describes REASON's basic deductive system. REASON has its antecedents in two
separate works. The first of these is an earlier program implemented in LISP which
was reported on in [Rich & Shrobe, 1976]. The current version of REASON is
written in a variant of AMORD [DeKleer, et al. 1977], a language for constructing
problem solvers. Both systems maintain a dependency network, but the AMORD
system does so in a cleaner manner, utilizing the Truth Maintenance System
[Doyle, 1978]. I will begin by reviewing the basic concepts and constructs of the
system.


Section  3.1:  Dependencies  and  Justifications 

REASON  is  implemented  in  a  variant  of  the  language  AMORD.  I  will  begin  by 
reviewing  the  syntax  and  basic  concepts  of  this  language: 

Imagine  a  reasoning  system  which  knew  that  numerical  ordering  is  transitive. 
Suppose  also  it  knew  that  X  was  less  than  Y  and  that  Y  was  less  than  Z. 
Presumably,  an  "ordinary  theorem  prover"  would  then  conclude  that  X  was  less  than 
Z.  However,  the  system  could  in  principle  deduce  more  than  just  this.  It  knows  that 
X is less than Z because it is less than Y which is in turn less than Z and because the
ordering  is  transitive. 

REASON, like some other newer systems [Doyle, 1978], [London, 1977], [Stallman
and Sussman, 1977], regards the justification for the new fact as an object of great
importance to the theorem prover itself. The justification for the new fact tells us
what other facts the new fact depends on. If we did not believe that X was less than
Y or that Y was less than Z or that numerical ordering is transitive then we ought
not to believe that X is less than Z. A justification states such a dependency between
facts.


REASON's goal is not only to prove properties of a program but to understand
how these properties follow from known or assumed properties of sub-modules. Hence
such dependencies are a crucial form of information in REASON. When an assertion
is entered into REASON's data base, it is always accompanied by a justification
explaining why the new assertion is believed. To make this convenient, as each



assertion is entered into the system's data base it is assigned a unique "fact-name" by
which  it  may  be  referenced.  For  example: 


Assertion      System-Supplied Fact-Name

(< X Y)        F-1
(< Y Z)        F-2

The user may add the fact deduced from the above by calling the ASSERT
function:


(Assert '(< X Z) '(Transitivity F-1 F-2))

ASSERT takes two arguments: the new assertion to be added to the data base and the
justification  for  the  assertion  which  is  a  list  whose  first  element  is  a  justification  type 
and  whose  remaining  elements  are  the  fact-names  upon  which  the  new  assertion 
depends. 


(Assert <fact> '(<type-name> ... <fact-name> ...))


One important justification type is PREMISE which, as the name suggests, indicates
that the fact is believed without further justification. A PREMISE justification has no
supporting facts. The three facts about ordering shown above could well have been
entered into the system as follows:

User Types                                   System-Supplied Fact-Name

(Assert '(< X Y) '(Premise))                 F-1
(Assert '(< Y Z) '(Premise))                 F-2
(Assert '(< X Z) '(Transitivity F-1 F-2))    F-3
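
The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not REASON's implementation: the class name `Database`, the method `assert_fact`, and the tuple encoding of assertions are all hypothetical. The sketch shows the two points the text makes: every assertion receives a system-supplied fact-name, and every justification is stored as an inspectable object naming the facts it depends on.

```python
from itertools import count


class Database:
    """Toy assertion store (hypothetical): names facts and keeps justifications."""

    def __init__(self):
        self._ids = count(1)
        self.names = {}            # assertion -> fact-name
        self.assertions = {}       # fact-name -> assertion
        self.justifications = {}   # fact-name -> list of (type, supporters)

    def assert_fact(self, assertion, justification):
        """Record assertion with a justification of the form (type, *fact_names)."""
        jtype, *supporters = justification
        for s in supporters:
            if s not in self.assertions:
                raise KeyError(f"unknown fact-name {s}")
        name = self.names.get(assertion)
        if name is None:
            # first time this assertion is seen: give it a fresh fact-name
            name = f"F-{next(self._ids)}"
            self.names[assertion] = name
            self.assertions[name] = assertion
            self.justifications[name] = []
        self.justifications[name].append((jtype, tuple(supporters)))
        return name


db = Database()
f1 = db.assert_fact(("<", "X", "Y"), ("Premise",))
f2 = db.assert_fact(("<", "Y", "Z"), ("Premise",))
f3 = db.assert_fact(("<", "X", "Z"), ("Transitivity", f1, f2))
```

Asserting the same fact a second time with a different justification adds to the list for the existing fact-name rather than creating a new one, which is one plausible reading of how a fact can have "other support".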

The rudimentary facility of any logic system is a mechanism for making deductions.
REASON accomplishes this using rules which consist of two elements: a trigger-set and
a body. The trigger-set is a list of patterns each of which has two parts: a
fact-name-variable and an assertion pattern. The body is a LISP expression which is
evaluated in an environment in which the variables of the patterns are bound to the
objects which they match. The following is a fairly typical REASON rule:




(rule ((:f (Rest :list-1 :list-2))              ; trigger
       (:g (Member :list-2 :obj-1)))            ; set
  (assert '(Member :list-1 :obj-1) '(List-Membership :f :g)))
          ; fact                    ; justification

Variables are indicated by a leading colon (:).

Here the body is the assert statement. The trigger set is the list:


((:f (Rest :list-1 :list-2))
 (:g (Member :list-2 :obj-1)))

In these triggers, the leading single variable (:f or :g) is the fact-name variable; the
remaining part of each trigger ((Rest :list-1 :list-2) or (Member :list-2 :obj-1)) is the
assertion pattern. Rules are dealt with in 3 stages:

(i)  When  an  assertion  is  added  to  the  data  base,  all  rules  with  a  trigger  whose 
assertion  pattern  matches  the  new  assertion  are  triggered. 

(ii) Each of the remaining triggers is examined to see if its assertion pattern also
matches an assertion in the data base. However, these matches must be consistent
with the variable bindings created by the earlier matches.

(iii)  If  all  of  the  triggers  have  a  matching  assertion,  then  the  rule  is  applicable  and  its 
body  is  executed  in  the  binding  environment  created  by  the  match. 

As  each  trigger  is  matched  to  an  assertion,  the  fact-name  variable  of  that  trigger 
is bound to the fact name of the matched assertion. This allows the body of the rule
to  refer  to  its  triggering  facts.  In  particular,  assert  statements  in  the  body  of  the  rule 
may  include  a  justification  mentioning  these  facts. 
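
The three stages can be sketched as follows. This is a hedged illustration in Python rather than AMORD: the functions `match` and `run_rule`, the tuple encoding of patterns, and the sample facts (including the relation name `Rest`) are assumptions made for the example. For brevity the sketch matches every trigger against the whole data base instead of being driven by the arrival of a new assertion, but it does enforce the consistency of bindings across triggers and binds each fact-name variable to the name of the matched assertion.

```python
def match(pattern, datum, bindings):
    """Match one assertion pattern against one assertion.

    Returns the extended bindings, or None when the match fails or would
    contradict a binding made by an earlier trigger.
    """
    if len(pattern) != len(datum):
        return None
    env = dict(bindings)
    for p, d in zip(pattern, datum):
        if isinstance(p, str) and p.startswith(":"):   # a pattern variable
            if env.setdefault(p, d) != d:
                return None
        elif p != d:                                    # a constant
            return None
    return env


def run_rule(triggers, facts, bindings=None):
    """Yield one binding environment per consistent match of all triggers."""
    bindings = {} if bindings is None else bindings
    if not triggers:
        yield bindings
        return
    (name_var, pattern), rest = triggers[0], triggers[1:]
    for fact_name, assertion in facts.items():
        env = match(pattern, assertion, bindings)
        if env is not None:
            env[name_var] = fact_name   # bind the fact-name variable
            yield from run_rule(rest, facts, env)


facts = {"F-1": ("Rest", "L1", "L2"), "F-2": ("Member", "L2", "a")}
triggers = [(":f", ("Rest", ":list-1", ":list-2")),
            (":g", ("Member", ":list-2", ":obj-1"))]

# The "body": build the new assertion and a justification naming the triggers.
derived = [(("Member", env[":list-1"], env[":obj-1"]),
            ("List-Membership", env[":f"], env[":g"]))
           for env in run_rule(triggers, facts)]
```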

At each moment any assertion has one of two statuses in REASON: it is either
in  or  out.  A  fact  which  is  in  is  believed  to  be  true.  An  assertion  whose  negation  is 
in  is  believed  to  be  false.  If  both  an  assertion  and  its  negation  are  in  then  the  data 
base  is  contradictory  and  corrective  action  is  required.  If  neither  the  assertion  nor  its 
negation  is  in,  then  the  truth-value  of  the  fact  is  simply  unknown. 




assertion    negated assertion    meaning

in           out                  assertion true
out          in                   assertion false
in           in                   contradiction
out          out                  truth value unknown

Facts which are out may be brought in by asserting the fact with a valid
justification. A fact which is in may be made out using the function RETRACT to
remove a valid justification. If no valid justification is left, the fact goes out. When
a fact goes out a check is made to see which other facts are affected by the change
of status of the first fact. If the first fact's going out invalidates the support of some
other fact, then the same checks are made recursively, outing all facts which now lack
well-founded support. Similarly, if a fact comes in because new support for it is
discovered, then a check is made to see which other facts are affected. Any facts
whose support is made valid by a change in status of the first fact are brought in.
Such inning and outing of facts is managed by the Truth Maintenance System, which
ensures that only facts with well-founded support are in. Thus, a fact which has
never been asserted is by definition out.

The meaning of justifications such as the transitivity justification shown above is
that whenever F-1 and F-2 are in then F-3 should also be in. If for some reason either
F-1 or F-2 became out, then F-3 would lack support and would also become out (unless
it has other support).
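
A minimal Python sketch of this behaviour follows. It is not Doyle's algorithm: the function name `labels` and the dictionary encoding are assumptions, and the sketch recomputes every status from scratch rather than propagating changes incrementally. Because it starts with nothing believed and only adds a fact once some justification has all of its supporters in, circular support can never bring a fact in, which is the sense of "well-founded" used above.

```python
def labels(justifications):
    """Return the set of fact-names that are in.

    justifications maps a fact-name to a list of supporter tuples; an empty
    tuple plays the role of a premise justification.
    """
    believed = set()
    changed = True
    while changed:
        changed = False
        for fact, justs in justifications.items():
            if fact not in believed and any(
                all(s in believed for s in j) for j in justs
            ):
                believed.add(fact)
                changed = True
    return believed


justs = {
    "F-1": [()],                 # premise
    "F-2": [()],                 # premise
    "F-3": [("F-1", "F-2")],     # transitivity
    "F-4": [("F-3",)],           # something deduced from F-3
}
before = labels(justs)

justs["F-2"] = []                # retract the only justification of F-2
after = labels(justs)
```

After the retraction F-3 loses its support and goes out, and F-4 goes out with it; F-1 is untouched.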

It is frequently necessary to assume that some fact holds even though no reason
exists for believing the fact. This is often done in hypothetical reasoning as when one
proves that A implies B by assuming A and deriving B. Since the assumed fact A has no
simple support (as for example (< X Y) above) a different type of justification is
required. One assumes a fact by making it depend on the outness of its negation;
thus, if the negation of the assumed fact should ever be proved, the assumption will
go out. This is done by the function ASSUME:


(Assume '(Made-of The-Moon Green-Cheese) '(Bill-Told-Me F-12))




which states that the system will believe that the moon is made of green cheese as
long as it has no reason to believe that the moon is not made of green cheese and as
long as it believes fact F-12 (which presumably is a statement of what Bill told the
system). ASSUME takes two arguments: a fact to assume, and a list whose first element
is an assumption-type-name (used only for mnemonic value) and whose cdr is a list of
fact-names which indicates the reasons for making the assumption. Whenever all the
fact-names in this list are in and the negation of the assumption is out, the assumed
fact is brought in.

This requires the justification built by ASSUME to have two parts: a list of
assertions upon whose inness the fact depends and a list of assertions upon whose
outness the fact depends. When the above ASSUME form is invoked it creates the
assertion:


F-1111   (Not (Made-of The-Moon Green-Cheese))

and then justifies the assertion

F-1117   (Made-of The-Moon Green-Cheese)

by stating that F-1117 depends on F-1111's outness and on F-12's inness:

F-1117   (Made-of The-Moon Green-Cheese)   (Bill-Told-Me (F-12) (F-1111))

The justification of F-1117 is not the same list as the second argument of the ASSUME
which created F-1117. The justification is built by ASSUME using the information
provided in its calling arguments. The first list in the justification is the list of facts
whose inness supports F-1117 and the second list is the out list. This support structure,
which originated in Doyle's TMS [Doyle, 1978], allows the system to believe that
the moon is made of green cheese until some deduction provides valid support for
F-1111. At that point, F-1111, the negation of F-1117, will become in (and thus not out).
Since F-1117 depends on the outness of F-1111 it will lack support and thus become out
itself. Similarly, if other facts had been deduced from F-1117 they would now lack
support, since F-1117 would no longer be in. The Truth Maintenance System
propagates these changes of status until only assertions with well-founded support
remain in.
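
The two-part justification can be illustrated with a short Python sketch. It is an assumption-laden toy, not the TMS: the names `is_in` and `assume`, the fact labels, and the recursive evaluation strategy are all hypothetical, and the sketch is only meaningful when no fact depends, directly or indirectly, on its own outness.

```python
def is_in(fact, justs, path=()):
    """True when fact has a valid, well-founded justification.

    justs maps a fact to a list of (in_list, out_list) pairs. A pair is valid
    when every fact in in_list is in and every fact in out_list is out.
    A fact met again on its own support path counts as out, so circular
    support never brings anything in.
    """
    if fact in path:
        return False
    path = path + (fact,)
    return any(
        all(is_in(f, justs, path) for f in in_list)
        and not any(is_in(f, justs, path) for f in out_list)
        for in_list, out_list in justs.get(fact, [])
    )


def assume(justs, fact, negation, reasons):
    """Believe fact while the reasons are in and its negation is out."""
    justs.setdefault(negation, [])                 # negation exists, unsupported
    justs.setdefault(fact, []).append((tuple(reasons), (negation,)))


justs = {"bill-said-so": [((), ())]}               # premise: both lists empty
assume(justs, "green-cheese", "not-green-cheese", ["bill-said-so"])
justs["picnic-on-moon"] = [(("green-cheese",), ())]    # deduced from the assumption

before = [is_in(f, justs) for f in ("green-cheese", "picnic-on-moon")]

justs["not-green-cheese"].append(((), ()))         # the negation gains support
after = [is_in(f, justs)
         for f in ("green-cheese", "not-green-cheese", "picnic-on-moon")]
```

Adding support for the negation removes belief in both the assumption and the fact deduced from it, so the set of believed facts shrinks as an assertion is added.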



Notice that in the support structure for assumptions, adding a new assertion to
the data base (e.g., the negation of an assumed assertion) can cause an assertion which
is in (the assumed one) to become out. Thus, the number of things believed to be
true can decrease as assertions are added. For this reason, such a support structure is
referred to as non-monotonic. When a contradiction is detected, the system finds
those assertions which are supported by non-monotonic dependencies (i.e., on the
outness of others) and brings in one of the assertions upon whose outness they depend.
Since the system maintains the dependencies between facts, it is easy for it to find
only those assumptions which logically are related to the contradiction and to use
these as the candidates for rejection. This avoids the thrashing which was found to
occur in chronological backtracking systems such as Micro-Planner
[Sussman, et al. 1971] and even in more flexible systems where dependencies were not
explicitly maintained. This process is called dependency-directed backtracking
[Stallman & Sussman, 1977].

Sometimes it is necessary to make an assertion depend on the inness of some
facts and the outness of a second set of facts. This can be done by calling ASSERT
with a justification argument whose first element is SL. Such a justification should
have three other elements: a mnemonic name, a list of facts upon whose inness the
new fact depends, and a list of facts upon whose outness the new fact depends. For
example:


(Assert '(Hacker Howie) '(SL MIT-people-hack (F-1) (F-2)))

will create the new assertion F-3, justifying it so that it will be in whenever F-1 is in
and F-2 is out. Thus, we would have:


Fact-Name    Assertion         Justification

F-3          (Hacker Howie)    (SL MIT-people-hack (F-1) (F-2))


A final type of justification arises in the proofs of implications. As mentioned
above, typically one proves (Implies A B) by assuming A and deriving B. The
justification of (Implies A B), however, is not logically the justification of B. It is,
instead, exactly those facts which were involved in deriving B from A and which do
not themselves depend on A. For example, consider the following trivial proof:




Fact-Name    Assertion        Justification

F-1          (Implies A C)    (Premise)
F-2          (Implies C D)    (Premise)
F-3          A                (Assumption)
F-4          C                (Modus-Ponens F-3 F-1)
F-5          D                (Modus-Ponens F-4 F-2)
F-6          (Implies A D)    (Conditional-Proof F-5 F-3)

As I mentioned above, the logical support of F-6 is precisely F-1 and F-2. To calculate
this the system can trace back through the dependencies to find those facts which
support F-5, the consequent of the implication. These are F-4 and F-2. F-4 is, in turn,
supported by F-3 and F-1. Of these assertions, we eliminate the hypothesis F-3 plus
those assertions which depend on it. These are F-3 and F-4, leaving only F-1 and F-2
which are then the support of F-6. The system is instructed to perform such an
analysis by asserting a fact with a justification whose justification-type is
Conditional-Proof:


(Assert '(Implies A D) '(Conditional-Proof F-5 F-3))

ASSERT performs a special analysis when given a second argument whose first
element is the justification-type Conditional-Proof. The first fact-name in a
Conditional-Proof justification is the consequent of the implication; the second
fact-name is the hypothesis of the conditional proof argument. ASSERT creates a
justification for the above assertion in which the support for the assertion is the set of
facts (such as F-1 and F-2) upon which the implication relies. Thus, if F-1 were to go
out, F-6 would lack support and go out itself.
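
The trace-back analysis can be written down directly. The Python sketch below is one way to compute it, with hypothetical function names (`ancestors`, `conditional_proof_support`) and a plain dictionary standing in for the dependency records: it collects everything the consequent rests on and discards the hypothesis together with every fact that itself depends on the hypothesis.

```python
def ancestors(fact, support):
    """All facts reachable by tracing supporters back from fact (fact included)."""
    found, stack = set(), [fact]
    while stack:
        f = stack.pop()
        if f not in found:
            found.add(f)
            stack.extend(support[f])
    return found


def conditional_proof_support(consequent, hypothesis, support):
    """Facts the implication relies on: those behind the consequent that do
    not themselves depend on the hypothesis."""
    return {f for f in ancestors(consequent, support)
            if hypothesis not in ancestors(f, support)}


# The trivial proof from the table: fact-name -> fact-names it is justified by.
support = {
    "F-1": [],               # (Implies A C), premise
    "F-2": [],               # (Implies C D), premise
    "F-3": [],               # A, the hypothesis
    "F-4": ["F-3", "F-1"],   # C, by modus ponens
    "F-5": ["F-4", "F-2"],   # D, by modus ponens
}
result = conditional_proof_support("F-5", "F-3", support)
```

Here `result` contains F-1 and F-2 only, matching the support the text derives for F-6.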

To facilitate the construction of complex dependency structures the system
includes a primitive for creating an assertion which has no justification. This
primitive is called ASSERTION and takes a single argument, the assertion to be created.
It returns the fact-name of the new assertion. For example, evaluating

(Assertion '(Hacker Howie))



will cause a new assertion to be built and assigned a fact-name, say F-2990. Since this
new assertion has no valid justification (it has no justification) it is out. At some
later time, some rule might decide to give this fact a justification; if this justification
is valid, F-2990 will come in. For example, if we had the following fact and rule:

F-2991   (At-MIT Howie)

(rule ((:f (At-MIT :person)))
  (Assert '(Hacker :person) '(MIT-is-full-of-Hackers :f)))

then the system would assert

F-2990   (Hacker Howie)   (MIT-is-full-of-Hackers F-2991)

Remember that F-2990 was originally created with no justification. Suppose at that
point another assertion F-2992 was created and made to depend on the outness of
F-2990, as follows:

(let ((a (Assertion '(Hacker Howie))))
  (Assert '(IlCtfffljl Howie) '(SL Hackers-are-loose () (,a))))

[Note: the comma used above is an "unquote" which causes the variable after it to be
evaluated even though it is inside a quoted form. Also LET is a macro defined in
standard MacLisp. The first argument to LET is a list of pairs of variables and forms;
each form is evaluated in the enclosing environment and then each variable is bound
to the value of its corresponding form. The remaining arguments to LET are forms to be
evaluated in the new environment created by the bindings.]

When F-2990 was first created, it was out; therefore F-2992 was in. However,
when the rule above is triggered by F-2991 it executes and brings F-2990 in. Since F-2992
depends on the outness of F-2990 it then goes out. Conversely, if the support for F-2990
is ever removed (using RETRACT, for example), then F-2990 will go out and F-2992 will
come in. This allows REASON to use justifications in building its control structures.

I  will  now  turn  to  the  issue  of  control  within  the  reasoning  system. 


Chapter 4: Explicit Control and The Task Network

The  traditional  weakness  of  automatic  deduction  systems  is  that  they  are  prone 
to  blind  searches.  The  room  for  exponential  explosion  is  so  large  that  even  large 
amounts  of  a  fixed  factor  overhead  are  justified  if  they  can  cut  down  the  size  of  the 
search  space. 

The approach I have followed here is to represent all control of the deductive
process explicitly in a form which can be manipulated by the same mechanisms as
those which conduct the logical deductive process itself. Such an approach has been
followed in [DeKleer, et al. 1977] and [McDermott, 1977]. This allows the deductive
system to be self-conscious, able to explain what it is doing and why it is doing it. Such a
system can reason about whether it ought to continue to pursue a particular task, or
rather abandon it as hopeless or of too little importance to command further resources
and attention. A system which is explicit in its control discipline can exhibit
flexibility which is precluded in more traditional systems which encode their control in
the state of procedures which cannot be examined.

This suggests a system which at the very least knows what task it is attending to
and where that task fits into its larger goals. REASON organizes its operation around
a data-structure called the task-network [McDermott, 1977] which makes this
information explicit. The task network is represented by assertions in the data base
used to record facts about the program being analyzed. However, the control
assertions in the data base have a justification structure which outs them once their
usefulness has passed.

A  simple  example  of  the  use  of  control  assertions  in  consequent  reasoning  will 
perhaps  clarify  the  discipline  used  in  REASON.  In  consequent  reasoning  the  system 
attempts  to  chain  backward  from  its  current  goal  to  sub-goals  which  interact  to  imply 
the  main  goal  and  which  are  (hopefully)  closer  to  facts  which  are  already  known 
explicitly. 

For  any  particular  goal  there  might  be  several  different  methods  for  deriving  the 
desired fact. A particular fact about a list might be derived by backward chaining
through some implication, or it might be deduced by structural induction. In general
one would tend to prefer the simpler method; however, there are cases in which the
opposite would be the better strategy.



This has led to the following protocol. A goal is entered into the system by calling
the primitive GOAL-ASSERT. This takes three arguments. The first of these is the
assertion to be proven. The second argument is used to indicate what higher level
task gave rise to this goal (this is usually some sub-task of the symbolic evaluator,
but for simplicity of presentation I will call this task top level). The third argument is
a justification (just as would be given to ASSERT). A goal created by GOAL-ASSERT should
remain in as long as it is neither refuted nor satisfied and as long as its justification is
valid. To achieve this, GOAL-ASSERT creates two assertions: one stating that the goal is
satisfied, the other stating that it is refuted. REASON builds an assertion stating
the existence of the new goal. This assertion is given a justification which is identical
to the justification passed in as an argument except that the new justification includes
a dependency on the outness of the assertions which state that the goal is satisfied or
refuted. The goal assertion will remain in until the goal is either satisfied or refuted,
at which time it will go out.

The assertion of a goal is an implicit request for the proposal of methods which
might be capable of proving the assertion. If the particular goal is of a type for
which a method is known, then the method is proposed. This proposal is given a
justification which points to the goal assertion. A special procedure called the ACCEPTOR
is responsible for choosing the order in which the various methods for a goal should
be tried. The primitive PROPOSE-METHOD is used to propose a method; it takes three
arguments: an assertion stating the method to propose, a justification for this assertion,
and a body to execute if the method is ever accepted.

If the desired goal is ever proven, then an assertion is made saying that the goal
is satisfied. If the negation of the goal is ever discovered, an assertion is added
stating that the goal has been refuted. Either of these events causes the original goal
assertion to go out, taking with it all of the dependent control assertions.
However, normal fact assertions will never depend on these control assertions; even
when the control assertions are made to go out, the facts deduced stay in if they are
logically valid.

What makes this protocol possible is a mechanism (developed in [Doyle, 1977])
which establishes well defined points at which the system may choose which method to
pursue. REASON is a queue based system whose main loop consists of finding pairs
of rules and matching facts. At each iteration one such pair is removed from the
queue and processed, potentially creating new facts and rules and thus new pairs of
matching facts and rules. However, at certain times there will be no such pairs of


Dependency  Directed  Reasoning 


4  Explicit  Control  and  The  Task  Network  S9 


rules  .ind  f.icts  to  process.  The  method  proposing  and  accepting  protocol  guarantees 
that  new  triggering  pairs  will  not  he  created  in  an  explosive  manner,  but  will  rather 
produce  proposals  for  actions  which  must  be  accepted  before  new  actions  can  occur. 
Even  though  the  tree  of  possible  methods  and  sub-goals  might  be  exponentially 
explosive,  the  system  has  the  option  of  choosing  which  branches  to  leave  unexplored. 
As  long  as  the  system  chooses  to  pursue  only  a  few  branches  at  at  time,  the  queue 
will  run  out  of  pairs  frequently. 

This is the occasion for the special acceptor procedure to be invoked. The
acceptor is a procedure run each time the queue runs out. Its purpose is to examine
the network of goals and methods, hopefully finding at least one which it deems
worth pursuing. The acceptor is allowed to add control assertions to the data base
and these are allowed to trigger rules which will, in turn, add assertions to the data
base. However, until it accepts a method, no further work on the goals at hand will
be performed. This organization, which is still being developed, allows the machinery
of the reasoning system to be used in deciding which goals should be pursued. Once
such a decision is made, a method is accepted, triggering the rules which actually do
the work.
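
The interplay between the queue and the acceptor can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the actual REASON main loop: the names run, goal_p, chain_on_q and first_proposal are hypothetical, and rule/fact pairs are modelled as plain (name, action) pairs.

```python
from collections import deque


def run(queue, proposals, acceptor):
    """Process queued actions; call the acceptor whenever the queue is empty.

    queue     -- deque of (name, action) pairs standing in for rule/fact pairs
    proposals -- list of (name, action) pairs for proposed methods
    acceptor  -- function that picks one proposal, or None to stop
    """
    log = []
    while True:
        if queue:
            name, action = queue.popleft()
            log.append(name)
            action(queue, proposals)      # may add work or new proposals
        else:
            choice = acceptor(proposals)  # queue ran out: consult the acceptor
            if choice is None:
                return log
            proposals.remove(choice)
            queue.append(choice)          # the accepted method's body may now run


def do_nothing(queue, proposals):
    pass


def chain_on_q(queue, proposals):
    queue.append(("modus ponens for Q", do_nothing))


def goal_p(queue, proposals):
    # Asserting the goal only proposes a method; nothing runs until accepted.
    proposals.append(("backward chain on P", chain_on_q))


def first_proposal(proposals):
    return proposals[0] if proposals else None


log = run(deque([("goal P", goal_p)]), [], first_proposal)
```

The acceptor here simply takes the first proposal; any selection policy could be substituted without changing the loop.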

A simple example will illustrate the technique. Suppose we want to prove P and
we have (Implies Q P), (Implies R Q) and R. We would start off by stating that we have
the goal P:

F-1   (Implies Q P)   (Premise)

F-2   (Implies R Q)   (Premise)

F-3   R               (Premise)

(Goal-Assert 'P '(top-level) '(Premise))

Since  a  goal  statement  has  been  entered,  the  system  makes  the  assumption  that 
the  goal  has  as  yet  been  neither  satisfied  nor  refuted.  Also  it  creates  a  goal  statement 
for  the  newly  created  goal  and  justifies  this  statement  as  explained  above. 




F-7   (Satisfied F-9)        ()   ; no justification, therefore out

F-8   (Refuted F-9)          ()   ; no justification, therefore out

F-9   (Goal P (top-level))   (Sub-goal () (F-7 F-8))   ; F-7 and F-8 out means F-9 is in


Notice that the goal assertion includes both the fact to be proved and a list of the
super-tasks which have led to the existence of this goal. Two rules are also created
by Goal-Assert; one watches for the goal becoming true, the other watches for
refutations. Both these rules depend on the fact F-9.


R-1   (Rule ((:f P))                     (Sat-rule F-9)
        (Assert '(Satisfied F-9)
                '(Satisfaction :f)))

R-2   (Rule ((:f (not P)))               (Ref-rule F-9)
        (Assert '(Refuted F-9)
                '(Refutation :f)))

The assertion of the goal statement is an implicit request for the proposal of methods
which might achieve the goal. The various method proposers now come into play.
One obvious method is backward chaining: finding an implication whose consequence is
the desired goal, and then posing the antecedent of the implication as a sub-goal.
The following rule proposes the backward chaining method and then conducts the
proof if the method is accepted.

(Rule ((:f (Goal consequent stack))
       (:g (Implies antecedent consequent)))
  (Propose-Method
    '(Method :f (Backward-Chain :g)) '(B-C :f :g)
    (goal-assert antecedent '(consequent . stack) '(bc-sub-goal :f))
    (Rule ((:g (Implies antecedent consequent))
           (:h antecedent))
      (Assert consequent (Modus-Ponens :g :h)))))

Notice the use of the primitive Propose-Method. This is actually a macro which
expands as follows:




(Propose-Method meth just body)  =>  (let ((:a (Assert meth just)))
                                       (Rule ((:b (Accepted :a)))
                                         body))

The  method  proposer  above  leads  to  the  following  results: 

F-11   (Method F-9 (Backward-Chain F-1))   (BC-Meth F-9 F-1)

The queue now runs out since there are no other actions possible. At this point the
acceptor is invoked. Seeing only one method available, the acceptor makes the obvious
choice, accepting the method proposed in F-11.


F-12   (Accepted F-11)   (Acceptor F-11)


The acceptance of the method proposed in F-11 allows the method proposer to
continue its work. This creates the following rule which represents the inference rule
of Modus Ponens:


■  -II  (Out*  ((  9  ( 1-oHdi  0  p|)  (••old*  »  ll  ■-!#) 

(  k  OH 

( A  a 1 1 r  t  P  (Hodul  pontna  g  h))) 

Notice that when the system created this rule, it automatically generated a
justification for it. New rules are created by the execution of a Rule expression which
is typically nested within another rule; when this outer rule is triggered by some fact,
the new rule is created with a justification indicating dependence on the outer rule
and the triggering fact.

The body of the method proposer also creates the new sub-goal Q, triggering a
series of assertions similar to those triggered by the original goal P. We get the
following:




F-16   (Satisfied F-18)         ()

F-17   (Refuted F-18)           ()

F-18   (Goal Q (P top-level))   (Sub-Goal (F-12) (F-16 F-17))

•  •1 

(rut*  ((  f  0)) 

( At i»rt  (Satlitiad  f  -  IS ) 

(S*t'|f*C|10fl  f ) > ) 

(Sat-«Ml  F 

•IS) 

■  -« 

<r»l*  <(  t  <i*«t  0) )  > 

(Alter!  (I*rut*d  T-IS) 

(S*Tut*ti*n  f))) 

(t*r-ts*t  r 

-»S) 

F-20   (Method F-18 (Backward-chain F-2))   (BC-Meth F-18 F-2)

The acceptor is now invoked and method F-20 is accepted.

F-21   (Accepted F-20)   (Acceptor F-20)

which in turn triggers the rule for backward chaining, resulting in:


F-22   (Satisfied F-24)           ()

F-23   (Refuted F-24)             ()

F-24   (Goal R (Q P top-level))   (BC-Sub-Goal (F-21) (F-22 F-23))

a-s 

(•«T*  ((  •  ( !■*»«•»  •  0)) 

(••*!*•  f-71 

■-S) 

(  A  •  >) 

( Alltel  0  (Nppul -P*n«ni  |  A))) 


At this point the necessary facts are available, allowing this rule to run on the fact F-3.
Thus, we obtain:

F-25   Q   (Modus-Ponens F-3 F-2)

However, Q now triggers the rule which is watching for an assertion satisfying the
goal F-18 (Goal Q (P top-level)). This causes a justification to be added to the satisfied
assertion, F-16, which was created when the goal F-18 was created:




(Satisfied F-18)   (Satisfaction F-25)

However, the goal assertion F-18 depends on the outness of F-16, and the method and
show assertions F-20 and F-21 depend on F-18. A quick inspection of the justifications
will show that the following support structure exists at this time:


F-16  =>>  F-18  =>  F-20  =>  F-21  =>  F-24

where a double headed arrow indicates non-monotonic support. Thus, when F-16
comes in, all of the other assertions go out. Notice, however, that all of these are
control assertions. The fact assertion F-25 Q depends only on F-3 and F-2; it stays in.
Furthermore, F-25 triggers the rule which represents the modus-ponens deduction
for Q and (Implies Q P). We obtain:


F-26   P   (Modus-Ponens F-25 F-1)

As before, this triggers a goal satisfied rule, this time R-1 for the goal F-9:

F-7   (Satisfied F-9)   (Satisfaction F-26)

which causes the goal assertion F-9 to go out. A similar chain of dependencies to that
above causes assertions F-11, F-12, and rules R-1, R-2 to go out as well. This leaves us
with only the following useful assertions:




F-1    (Implies Q P)             (Premise)

F-2    (Implies R Q)             (Premise)

F-3    R                         (Premise)

F-4    (Subgoal P (top-level))   (Premise)

F-7    (Satisfied F-9)           (Satisfaction F-26)

F-16   (Satisfied F-18)          (Satisfaction F-25)

F-25   Q                         (Modus-Ponens F-3 F-2)

F-26   P                         (Modus-Ponens F-25 F-1)

Of course, this entire deduction might have been achieved more easily by a
simple forward chaining rule for modus ponens. However, I have gone through this
detail to illustrate the steps of the protocol. In general, blind forward chaining is a
bad strategy since it allows uncontrolled deductions to lead into endless loops.
Suppose that we added the following facts to the data-base:


(Implies (number :x) (number (plus 1 :x)))
(number 1)


Then  the  modus  ponens  rule  would  trigger  infinitely  often,  filling  the  data-base  with 
assertions  of  the  form: 


(number 1)

(number (plus 1 1))

(number (plus 1 (plus 1 1)))


Obviously,  such  infinite  counting  chains  cannot  be  allowed  to  occur.  On  the 
other  hand,  it  is  desirable  to  allow  some  deductions  to  proceed  in  a  forward  manner. 
I  have  so  far  found  it  convenient  to  have  both  an  antecedent  and  a  consequent  modus 
ponens  rule;  however,  one  must,  therefore,  avoid  writing  implications  such  as  the  one 
above.  In  the  particular  environment  in  which  REASON  operates  one  rarely  wants  to 
state  implications  like  the  one  above  anyhow.  As  we  will  see  later,  most  of 
REASON's knowledge is expressed in the form of descriptions of data-structures using
a  specially  designed  specification  language.  This  allows  the  knowledge  acquisition 
portion  of  the  system  to  build  rules  which  correspond  to  these  specifications  and  which 
do  not  engage  in  uncontrollable  forward  chaining. 
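
The runaway behaviour of an antecedent modus ponens rule on such an implication can be reproduced with a small Python sketch. The function names forward_chain and successor_rule and the step limit are assumptions introduced for the illustration; they are not part of REASON.

```python
def forward_chain(facts, rule, max_steps):
    """Apply rule to every fact, oldest first, for at most max_steps steps.

    Returns the facts derived so far and a flag telling whether work
    was still pending when the step limit was reached.
    """
    known = list(facts)
    agenda = list(facts)
    steps = 0
    while agenda and steps < max_steps:
        fact = agenda.pop(0)
        new_fact = rule(fact)
        if new_fact is not None and new_fact not in known:
            known.append(new_fact)
            agenda.append(new_fact)
        steps += 1
    return known, bool(agenda)


def successor_rule(fact):
    # Stands for (Implies (number :x) (number (plus 1 :x))).
    if fact[0] == "number":
        return ("number", ("plus", 1, fact[1]))
    return None


facts, still_pending = forward_chain([("number", 1)], successor_rule, 3)
```

However large the step limit is made, work is still pending when it is reached; only the artificial bound stops the chain.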




It should be pointed out that one of the distinct advantages of the regimen of
explicit control is that it is quite simple for the system to determine that it is
engaging in infinite loops. If, for example, the same pattern appears as a subgoal of
itself, then the system can decide not to pursue that subgoal by marking it as a loop
and never accepting it. Similarly it can set itself limits on how deep into a case
analysis it will go before deciding that it's on a losing course. Indeed, although our
work has not yet progressed this far, the reasoning system can, in principle, reason
about what that limit ought to be given the particular circumstances in which it
currently finds itself. This begins to suggest the idea of the reasoning system having a
"state of mind".

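A check of this kind might look as follows in Python. This is a sketch under assumptions: the function name worth_pursuing, the representation of the goal stack as a tuple, and the default depth limit are all hypothetical.

```python
def worth_pursuing(pattern, stack, max_depth=10):
    """Decide whether a proposed subgoal should be eligible for acceptance.

    pattern -- the fact the subgoal tries to prove
    stack   -- tuple of the super-goals that led to it, innermost first
    """
    if pattern in stack:
        return False          # the goal is a subgoal of itself: a loop
    if len(stack) >= max_depth:
        return False          # too deep: treat as a losing course
    return True
```

For example, worth_pursuing("P", ("Q", "P", "top-level")) is False because P already appears among its own super-goals.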



Section 4.1: Hypotheticals

The actual protocol is, in fact, more complicated than what I have illustrated so
far. The complication is caused by the use of hypotheticals, particularly in conditional
proofs. As a paradigmatic case consider the following problem (I will omit the
refutation assertions in this example for the sake of brevity):

Given    (Or A B)
         (Implies A C)
         (Implies B C)
         (Implies C D)
to Show  D

REASON might attack this problem in a manner similar to that employed above,
backward chaining from the goal D to both A and C. At this point it recognizes that
case splitting the disjunction (Or A B) is an appropriate method. This creates a set of
conjunctive sub-goals, in this case, (Implies A D) and (Implies B D). Each of these is
proven by the standard conditional proof method, assuming the antecedent and
attempting to prove the consequent. The following is an excerpt of the assertions that
result:




F-50    (or a b)                                     (premise)
F-51    (implies a c)                                (premise)
F-52    (implies b c)                                (premise)
F-53    (implies c d)                                (premise)

F-54    (satisfied F-55)                             ()   ; No support, therefore I'm out
F-55    (goal d (top-level))                         (subgoal (F-47) (F-54))
R-1     (Rule ((:f d))                               (goal-sat F-55)
          (Assert '(satisfied F-55)
                  '(Goal-Sat :f)))
F-65    (method F-55 (splitting F-50))               (method F-55)
F-67    (accepted F-65)                              (acceptor F-65)

F-99    (satisfied F-100)                            ()   ; No support, therefore I'm out
F-100   (goal (implies a d) (d (top-level)))         (subgoal (F-74) (F-99))
R-2     (Rule ((:f (implies a d)))                   (goal-sat F-100)
          (Assert '(satisfied F-100)
                  '(Goal-Sat :f)))
F-105   (method F-100 (standard-implication))        (si-method F-100)
F-110   (accepted F-105)                             (acceptor F-105)

F-116   a                                            (impl-assump (F-110) (F-117))
F-117   (not a)                                      ()
F-118   c                                            (mp F-51 F-116)

F-125   (satisfied F-126)                            ()   ; No support, therefore I'm out
F-126   (goal d ((implies a d) d (top-level)))       (subgoal (F-110) (F-125))
R-3     (Rule ((:f d))                               (goal-sat F-126)
          (Assert '(satisfied F-126)
                  '(Goal-Sat :f)))

F-127   d                                            (mp F-53 F-118)

Notice the rules R-1, R-2 and R-3 which correspond to the goals F-55, F-100 and
F-126. Two of these (R-1 and R-3) are waiting for the same fact (F-127 D) to come in,
at which time each rule will assert that its goal (F-55 and F-126 respectively) is
satisfied. This, however, is an obvious mistake. If F-127 D comes in at this stage, this
does not imply that D is true, only that D follows from the assumption A. Thus, the
goal F-55 should not be satisfied by this occurrence of D.




The classic solution to this problem in AI systems such as PLANNER
[Hewitt, 1972], CONNIVER [McDermott, 1973], and QA4 [Rulifson, et al., 1972] is to
use a context mechanism to represent the "echelon" in which the implication will be
derived. Typically, a new context is created in which the assumption A as well as the
new goal D (or its analogue) are asserted. When the fact D comes in, the satisfied
assertion is added to the new context which is then discarded; only (Implies A D) is
returned to the old context. The problem with this approach is that it is altogether
possible that the fact D derived in this new context might not depend on the
assumption A, in which case the system ought to assert that the main goal D is satisfied
and terminate the process; the context system is incapable of doing this since the
contexts do not represent logical dependency but only the chronology of the problem
solver's behavior. Lacking any representation of logical dependency, a context system
becomes overly rigid.

REASON instead uses the dependencies maintained by the truth maintenance
system as well as the explicit control assertions to guide itself to appropriate
conclusions. Goal assertions contain within them a goal stack, indicating the chain of
subgoals which led to the current goal. Although each such assertion includes a
(linear) stack, the set of all such assertions is a (potentially non-linear) network, since
the same goal may be reached by several different paths. The stack is included in
goal assertions for two reasons. First, it allows two nested occurrences of the same goal
(such as D above) to be distinguished by their goal stacks. Since the two goals are
represented by different assertions we may easily say that only one is satisfied while
leaving the other to remain as an active goal.

The second reason for this representation is connected with the use of
hypotheticals such as the assumption A made in trying to prove (Implies A D). The
goal assertions actually used in REASON are an extension of those I have shown so
far, including not only the goal stack, but also a set of assumptions made as part of
the deduction process. These assumptions are referred to as the context, although this
context should be distinguished from that of CONNIVER or QA4, in that the
assumption set is an unordered set, while CONNIVER and QA4's contexts are strictly
nested. Every time a new assumption is made for the sake of hypothetical reasoning
(as in proofs of implications or in indirect proofs) the newly assumed fact is added to
the context part of the goal assertion associated with that assumption. To facilitate
this, the primitive Goal-Assert takes one more argument than shown above, namely the
assumption context. Thus, the same set of assertions as shown above will now be
represented as follows:




F-50    (or a b)                                     (premise)
F-51    (implies a c)                                (premise)
F-52    (implies b c)                                (premise)
F-53    (implies c d)                                (premise)

F-54    (satisfied F-55)                             ()   ; No support, therefore I'm out
F-55    (goal d for (top-level) in ())               (subgoal (F-47) (F-54))
R-1     (Rule ((:f d))                               (goal-sat F-55)
          (Assert '(satisfied F-55)
                  '(Goal-Sat :f)))
F-65    (method F-55 (splitting F-50))               (method F-55)
F-67    (accepted F-65)                              (acceptor F-65)

F-99    (satisfied F-100)                            ()   ; No support, therefore I'm out
F-100   (goal (implies a d)                          (subgoal (F-74) (F-99))
              for (d (top-level)) in ())
R-2     (Rule ((:f (implies a d)))                   (goal-sat F-100)
          (Assert '(satisfied F-100)
                  '(Goal-Sat :f)))
F-105   (method F-100 (standard-implication))        (si-method F-100)
F-110   (accepted F-105)                             (acceptor F-105)

F-116   a                                            (impl-assump (F-110) (F-117))
F-117   (not a)                                      ()   ; No support, therefore I'm out
F-118   c                                            (mp F-51 F-116)

F-125   (satisfied F-126)                            ()   ; No support, therefore I'm out
F-126   (goal d                                      (subgoal (F-110) (F-125))
              for ((implies a d) d (top-level))
              in (a))
R-3     (Rule ((:f d))                               (goal-sat F-126)
          (Assert '(satisfied F-126)
                  '(Goal-Sat :f)))

F-127   d                                            (mp F-53 F-118)

In goal assertions the keyword "for" indicates the subgoal stack while the
keyword "in" indicates the assumption context. When the fact F-127 D comes in now,
it is possible to determine which goal it actually satisfies. The rule is as follows:




Given an assertion P which triggers the pattern of a rule which is watching for the
satisfaction of some goal:

1.  Request the Truth Maintenance System to prepare a list of all assumptions which
support the satisfying fact P.

2.  Fetch all goal statements whose goal matches the satisfying fact P.

3.  For each goal assertion test whether the assumptions found in step 1 are a subset
of the assumptions listed in the goal assertion's context list.

4.  Discard those goal assertions which fail the test in 3.

5.  For each of the remaining goal assertions in 4 assert that the fact P satisfies the
goal assertion.
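
The five steps above can be sketched in Python. This is an illustration under assumptions rather than REASON's code: the function names, the dictionary representation of goal assertions, and the support table standing in for the Truth Maintenance System are all hypothetical.

```python
def supporting_assumptions(fact, support, assumptions):
    """Step 1: collect every assumption found in the support of fact.

    support maps a fact to the facts it was derived from; assumptions is
    the set of facts that were introduced hypothetically.
    """
    found, seen, todo = set(), set(), [fact]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in assumptions:
            found.add(current)
        todo.extend(support.get(current, ()))
    return found


def goals_satisfied_by(fact, goals, support, assumptions):
    needed = supporting_assumptions(fact, support, assumptions)   # step 1
    matching = [g for g in goals if g["goal"] == fact]            # step 2
    # Steps 3-5: keep only the goals whose context covers the assumptions.
    return [g["name"] for g in matching if needed <= set(g["context"])]


goals = [
    {"name": "main", "goal": "D", "context": ()},
    {"name": "case-A", "goal": "D", "context": ("A",)},
    {"name": "implication", "goal": "(Implies A D)", "context": ()},
]
support = {"C": ["A", "(Implies A C)"], "D": ["C", "(Implies C D)"]}

result = goals_satisfied_by("D", goals, support, assumptions={"A"})
```

With D resting on the assumption A, only the nested goal whose context contains A is satisfied; the main goal, with an empty context, is left active.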

This procedure results in the following assertion when applied to the situation
described above:


F-125   (satisfied F-126)   (goal-found F-127 F-116)

However, REASON does not assert that the original goal assertion F-55 D is satisfied
since it has an empty assumption context while the assertion F-127 depends on the
assumption F-116 A.

This process is driven by the rules which watch for the presence of
goal-satisfying facts, yet these rules in the current system are required to be stated in a
more procedural manner than desired. The algorithm we have just stated is one
which determines whether a certain pattern of dependencies obtains and acts only in
that case. Given the philosophy of explicit control in a rule based system, it would be
preferable to include such dependency patterns in the triggering list of a rule, simply
allowing it to run whenever the appropriate combination of support and facts obtains.
Unfortunately, the current mechanisms are considerably too weak to implement these
desiderata and we are forced to employ more awkward mechanisms. What one would
like is to be able to write something like the following:




(Rule ((:f1 (goal g for stack in context))
       (:f2 g)
       (t (subset (assumption-support :f2) context)))
  (Assert '(satisfied :f1) '(goal-found :f2)))


where the intention is to treat the third clause as if it were a fact, triggering the rule
whenever the condition expressed in this clause is true. This, however, puts a burden
on the truth maintenance system to not only be aware of changes in the in and out
statuses of facts, but also to be aware of more complex conditions, signalling these to
the reasoning system as well.

This poses an interesting research direction for future work which I have not yet
pursued. Is it possible to develop a lexicon of such useful justification patterns and to
extend the truth maintenance system to support the facility just outlined?

The partial solution I have adopted is to have a special kind of rule called a
trigger-rule which runs each time all its patterns come in. The body of this rule is
then free to perform further checks to determine if it wants to proceed; in particular,
it can investigate the support patterns of various assertions. If these patterns of
support are appropriate, the rule can then add assertions to the data-base. If not, it
can simply exit; however, if the triggering patterns of the rule all become in at some
later time, then the trigger rule will execute again, allowing the support pattern to be
checked once more.

This treatment of trigger rules is different from normal rules. When a normal
rule matches a set of facts, it is run on this set of facts exactly once: the first time
that all the facts are in. Normal rules are concerned with truth; they implement
deductions which are thereafter maintained by the truth maintenance system. When a
rule runs, its job is to create new facts and to provide the TMS with a justification
for each of these. As we've seen, a justification consists of two sets of facts: the in
antecedents and the out antecedents. These latter are used only in assumptions. The
TMS views justifications as permanent implications; the inness of the in antecedents
together with the outness of the out antecedents implies the inness of the consequent.




Thus a rule concerned with truth need only run once on any set of triggering
facts since the pattern of support it creates is eternal and can be handled by TMS
without further executions of the rule. Trigger rules are, in contrast, concerned with
utility and control, concepts of much greater plasticity; they, therefore, require the
additional flexibility of executing any time their trigger facts change status to in.
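
The difference between the two kinds of rule can be sketched in Python. The class names NormalRule and TriggerRule, the method facts_came_in, and the way support is modelled as a plain set are assumptions made for this illustration only.

```python
class NormalRule:
    """Runs its body once per set of triggering facts."""

    def __init__(self, body):
        self.body = body
        self.already_run = set()

    def facts_came_in(self, facts):
        key = frozenset(facts)
        if key not in self.already_run:
            self.already_run.add(key)
            self.body(facts)


class TriggerRule:
    """Runs its body every time its triggering facts come in."""

    def __init__(self, body):
        self.body = body

    def facts_came_in(self, facts):
        self.body(facts)


normal_runs, satisfied = [], []
support_of_d = {"A"}           # D currently rests on the assumption A
goal_context = set()           # the main goal allows no assumptions


def check_goal(facts):
    # The trigger rule's body inspects support before asserting anything.
    if support_of_d <= goal_context:
        satisfied.append("main goal")


normal = NormalRule(normal_runs.append)
trigger = TriggerRule(check_goal)

for rule in (normal, trigger):
    rule.facts_came_in(("goal", "D"))   # D comes in under assumption A

support_of_d.clear()                    # D is re-derived without assumptions

for rule in (normal, trigger):
    rule.facts_came_in(("goal", "D"))   # the same facts come in again
```

The normal rule fires only on the first occasion, while the trigger rule re-examines the support each time and asserts satisfaction only once the support no longer contains an assumption outside the goal's context.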

Once the trigger rule has executed, REASON concludes that the implication
(Implies A D) is proven. This is justified by a conditional proof justification; the
computed support is independent of the assumption A and includes only
F-51 (Implies A C) and F-53 (Implies C D).

REASON moves on to the second half of the case-split. This proceeds in exactly
the same manner as above. In this half, B is assumed, which leads to C by
Modus Ponens. The TMS already has a justification which says that if C is in,
then D should be in as well; therefore, D is brought in and the trigger rules run
again. The trigger rule for the first side of the split is out since it depended on the
goal F-126 which, since it is satisfied, is out; the trigger rule for the second side of the
split and for the main goal F-55 are in. This time they conclude that the new
sub-goal D is satisfied; therefore, (Implies B D) is proved. Its conditional proof justification
computes that the support includes only F-52 (Implies B C) and F-53 (Implies C D). The
main goal F-55, however, still cannot be satisfied, since D still depends on B,
an assumption not in F-55's context set.

However, the case analysis is now completed; D has been derived from both
sides of the split. It is asserted with its justification pointing at the disjunction used
for the case analysis and the implications proven on each side of the split. The
trigger rule associated with F-55, the main goal, is still in and is triggered once again.
This time it succeeds since the current support of D includes no assumptions at all,
but only the facts F-50, F-51, F-52, F-53.


Section 4.2: The Rules Of Inference

In the following pages, I will present the rules of inference currently used in the
experimental system. These should not be regarded as finished products, but as
experiments along the way. Similar rules may be found in [Doyle, 1977]. These rules
are essentially the normal rules of standard logic embedded within the discipline of
explicit representation of control information. It should be noted that these rules have
the property that no non-control assertion will ever depend on a control assertion;
thus, the logical validity of normal assertions is independent of the control regime.

The reader should also note that I do not show here those rules which are
responsible for selecting among competing methods. These are still under
development.

Contradiction  Signalling 

Signal a contradiction to the TMS if both a fact and its negation are in.

(Rule ((:f p)
       (:g (not p)))
  (Assert '(and p (not p)) '(contradiction :f :g)))


Double  Negation  Simplification  (Not  Elimination) 

(Rule ((:f (not (not p))))
  (Assert p '(double-negation-elim :f)))




Antecedent Use of If-Then-Else (If-then-else elimination)

If the antecedent of an If-Then-Else is in, assert the fact for the Then branch. If the
negation of the antecedent is in, assert the fact for the Else branch.

(rule ((:f (if cond then true else false)))
  (rule ((:g cond))
    (assert true '(if-then-else-true :f :g)))
  (rule ((:g (not cond)))
    (assert false '(if-then-else-false :f :g))))


Modus  Ponens  (Implication  Elimination) 

(rule ((:f (implies a b))
       (:g a))
  (assert b '(mp :f :g)))


Quantifiers 

The quantifiers used in REASON are slightly different from those of normal logic
systems. The universal quantifier is a restricted quantifier with the force of an
implication, for example:

(for-all (:x) (member :x table)
         (not (key :x key-1)))

which states that no member of the table has key equal to key-1.

The existential quantifier is always coupled with a such-that clause which has the
force of a conjunction, e.g.


f  'e  f  t  f  tHw-f 

•  *>'*  ’‘ai  'If/PtM  a  »•*  1  ‘ 

Each quantified statement includes a list of variables bound by the quantifier; the first
clause of either type of quantified statement must mention all variables in this list; the
second clause may not have any free variables.




Antecedent  Use  of  Universal  Quantification  (Universal  Elimination). 

If an object exists which matches the antecedent of the For-All, then assert the
consequent clause (with the bindings substituted in).


(rule ((:f (for-all vars p q))
       (:g p))
  (assert q '(for-all :f :g)))

For  example,  if 

(fqr-qll  (  «)  |»l»litr  >) 

(thqrq-M  {  y)  (lupllit  Mtl-l  y) 
ivch-thtt  (tint  :y  :■))) 

is  matched  against 


()Mtr  1l»t-l  OPjltl  -ll) 

we  would  then  eliminate  from  the  universal,  obtaining: 

(  Tlvqpq .  it  (  y)  (lutllll  lut-l  y) 

(flol  y 




Universal  Introduction 

(rule ((:f (goal (for-all :vars :p :q) for :goal in :context)))
  (propose-method
    '(Method :f (typical-member)) '(UI-Meth :f)
    (let ((subs (ui-build-subs :vars)))
      (let ((:newp (instance :p subs))
            (:newq (instance :q subs)))
        (goal-assert '(implies :newp :newq)
                     '((for-all :vars :p :q) . :goal)
                     :context
                     '(for-all-sub-goal :f))
        (rule ((:g (implies :newp :newq)))
          (assert '(for-all :vars :p :q) '(ui :g)))))))

Where ui-build-subs binds each variable in vars to a newly created anonymous object
(see next section). An anonymous object is one whose identity is unknown; it is a
priori impossible to tell whether an anonymous object is identical to another object.
ui-build-subs also marks each anonymous object it creates as a ui-object (using the
object's property list); this mark is used by the existential introduction and elimination
rules to prevent certain logical bugs explained below.
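
The roles of ui-build-subs and instance can be sketched in Python. The names mirror those in the text, but the data layout (nested tuples for patterns, a dictionary for property lists, a counter for naming anonymous objects) is an assumption made for the example.

```python
import itertools


def ui_build_subs(variables, properties, counter):
    """Bind each variable to a fresh anonymous object marked as a ui-object."""
    subs = {}
    for variable in variables:
        anonymous = "anon-%d" % next(counter)
        properties[anonymous] = {"ui-object": True}
        subs[variable] = anonymous
    return subs


def instance(pattern, subs):
    """Substitute the bindings in subs into a nested-tuple pattern."""
    if isinstance(pattern, tuple):
        return tuple(instance(part, subs) for part in pattern)
    return subs.get(pattern, pattern)


properties = {}
subs = ui_build_subs([":x"], properties, itertools.count(1))
new_p = instance(("member", ":x", "table"), subs)
new_q = instance(("not", ("key", ":x", "key-1")), subs)
```

Proving the implication from new_p to new_q for the anonymous object then stands for proving the quantified statement for every member of the table.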



Expansion  of  Existential  Quantification  on  Demand  (Existential  Elimination) 

The reasoning system occasionally will want to expand an existential, replacing the
variable of the quantified statement by an anonymous object. However, if the
existential contains ui-objects (objects created for Universal Introduction) then the
anonymous objects created are marked to show their dependency on the ui-objects.


(rule ((:f (there-is :vars :fact such-that :pred)))
  (rule ((:g (expand :f)))
    (let ((:subst (ee-build-subs :vars :pred :fact)))
      (let ((:new-fact (instance :fact :subst))
            (:new-pred (instance :pred :subst)))
        (assert :new-fact "(there-is :f))
        (assert :new-pred "(there-is :f))))))

where ee-build-subs creates a substitution which binds each of the variables in :vars to
a newly created anonymous object. instance is a function which substitutes these
bindings into its second argument. Expand assertions such as referred to in the above
rule are asserted occasionally during symbolic program evaluation.
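The two helpers used by the quantifier rules can be sketched in Python. This is only an illustration under stated assumptions, not REASON's code: the names build_subs, instance and property_list, the "anon-N" naming scheme, and the representation of statements as nested tuples are all hypothetical.

```python
import itertools

_fresh = itertools.count(1)
property_list = {}  # anonymous object name -> its properties (hypothetical layout)

def build_subs(variables, ui_objects=()):
    """Bind each variable to a new anonymous object, recording UI dependencies."""
    subst = {}
    for var in variables:
        name = f"anon-{next(_fresh)}"
        if ui_objects:
            # Mark the new object as depending on the given ui-objects.
            property_list[name] = {"ui-dependency": list(ui_objects)}
        subst[var] = name
    return subst

def instance(expr, subst):
    """Return a copy of expr with each bound variable replaced by its binding."""
    if isinstance(expr, tuple):
        return tuple(instance(part, subst) for part in expr)
    return subst.get(expr, expr)

# Expanding an existential whose body mentions the ui-object object-1.
subst = build_subs([":y"], ui_objects=["object-1"])
new_fact = instance(("member", "object-1", ":y"), subst)
```

The substitution is a plain dictionary, so applying it never modifies the original statement; the dependency mark is kept apart from the statement, in the spirit of the property list mentioned above.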

As mentioned, ee-build-subs marks the newly created anonymous objects for their
dependence on ui-objects. For example, if object-1 is a ui-object and we have the
following:

(there-is (:y) (member object-1 :y) such-that (key :y key-1))

and existential elimination creates an anonymous object, member-1, and substitutes this
for :y, then ee-build-subs will mark member-1 to show its dependence on object-1. We
will obtain the assertions:

(member object-1 member-1)

(key member-1 key-1)

and the property list of member-1 will be marked so that the ui-dependency property of
member-1 is the list (object-1).




Existential  Introduction 

(rule ((:f (goal (there-is :vars :p such-that :q) for :goal in :context)))
  (propose-method
    "(method :f (standard-ei)) "(EI :f)
    (goal-assert :p
                 "((there-is :vars :p such-that :q) . :goal)
                 :context
                 "(there-is-sub-goal-1 :f))
    (rule ((:g :p))
      (cond ((ui-object-free :p :vars)
             (goal-assert :q
                          "(:p . :goal)
                          :context
                          "(there-is-sub-goal-2 :f :g))
             (rule ((:h :q))
               (cond ((ui-object-free :q :vars)
                      (assert "(there-is :vars :p such-that :q)
                              "(exist-intro :h :g))))))))))


The function ui-object-free checks to see whether the matching has resulted in a
variable of either :p or :q becoming bound to an expression containing an anonymous
object which is marked with a UI-DEPENDENCY property. If so, this expression
cannot be used to introduce an existential. This prevents the classic mistake of using:

(for-all (:x) (object :x) (there-is (:y) (object :y) such-that (P :x :y)))

to conclude:

(there-is (:y) (object :y) such-that (for-all (:x) (object :x) (P :x :y)))

(where object is a predicate true of everything, and is used merely to fill the first
position of my restricted quantifier notation).
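A small finite model shows why the inference above is a mistake. The sketch below is an illustration only: the two-element domain and the choice of P as equality are assumptions picked to make the counterexample as small as possible.

```python
# Domain with two objects; P(x, y) holds exactly when x and y are the same object.
domain = [0, 1]

def P(x, y):
    return x == y

# (for-all (x) (there-is (y) such-that (P x y))): every x has some matching y.
forall_exists = all(any(P(x, y) for y in domain) for x in domain)

# (there-is (y) such-that (for-all (x) (P x y))): one y must work for every x.
exists_forall = any(all(P(x, y) for x in domain) for y in domain)
```

Here forall_exists is true while exists_forall is false. In the first statement the witness for y depends on x, which is the dependency that the UI-DEPENDENCY mark records.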

There are some delicate issues of control involved in simultaneous sub-goals which
share variables. These come up in proving existential quantifiers like the ones used
here. [Doyle, 1977] discusses these problems and presents a more advanced solution
than I have used here.




Expansion  of  And  (And  Elimination) 


(rule ((:f (and :p :q)))
  (assert :p "(and-elim :f))
  (assert :q "(and-elim :f)))


Disjunctive  Elimination  (Or  Elimination) 


This is done in two parts. When a disjunct is asserted, it is put into an expanded
form so that rules can easily determine whether a particular fact is a clause of the
disjunction. Then, if a clause in a disjunction is ever negated, a new disjunction can
be asserted, including all the other clauses of the old disjunction.

Set  Up  Expanded  version  of  Disjunct 


(rule ((:f (or . :disj)))
  (do ((:dis :disj (cdr :dis)))
      ((null :dis))
    (let ((:dis-1 "(disjunct-of :f ,(car :dis))))
      (assert :dis-1 "(expanded-disjunct :f)))))

The double quote (") is a macro which produces a list. The items inside the list are
not evaluated unless preceded by a comma (,). Variables (e.g. :f) are always evaluated.
Let contains two parts: a set of binding expressions and a body. Each binding
expression contains a variable and an expression. The variable is bound to the value
of the expression. The body is executed in the environment created by these bindings.


Do  The  Actual  Work  When  Appropriate 

(rule ((:f (disjunct-of :d :p))
       (:g (not :p)))
  (assert "(or ,(safe-delete :p !:d)) "(disjunct-elim :f :g)))



The exclamation point character (!) is a macro which when applied to a fact name
produces a fact statement, i.e. !:d = (Or ...) in the above. safe-delete is a variant of
the built-in LISP function delete which does not side-effect its argument.


Simplify  singleton  or 

If the above leads to a disjunction with only one clause, that clause may be asserted.

(rule ((:f (or :p)))
  (assert :p "(or-single :f)))
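The combined effect of the disjunctive elimination rule and the singleton simplification can be sketched as follows. This is a hedged illustration, not the rule system itself: disjunctions are represented as tuples headed by "or", and the names safe_delete and or_elimination are assumptions made for the example.

```python
def safe_delete(item, clauses):
    """Like a delete that copies: returns a new list and leaves `clauses` unchanged."""
    return [c for c in clauses if c != item]

def or_elimination(disjunction, negation):
    """disjunction is ('or', c1, c2, ...); negation is ('not', p)."""
    remaining = safe_delete(negation[1], list(disjunction[1:]))
    if len(remaining) == 1:
        return remaining[0]          # a singleton 'or' collapses to its only clause
    return ("or", *remaining)

d = ("or", "a", "b", "c")
step1 = or_elimination(d, ("not", "b"))      # b is refuted
step2 = or_elimination(step1, ("not", "c"))  # then c is refuted
```

Because the delete is non-destructive, the original disjunction d stays available as a justification for the new assertions.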


The  Basic  conjunctive  goal  mechanism: 

This is essentially a sub-routine called by other strategies. The routine will try
each of the sub-goals in turn. A later sub-goal will not be tried until the prior sub-goal
is satisfied. A refutation will stop the iteration, asserting that the calling goal is
refuted. Thus, this should be called only when the conjunction of the sub-goals is
equivalent to the calling goal.

The call is made by asserting

(conjunctive-goals :first :rest :dep :stack :context)

where the arguments have the following meaning:

:first - the first sub-goal to attempt.

:rest - the remaining sub-goals to attempt after the first is satisfied.

:dep - previously accumulated assertions upon which the final goal will depend. Each
time a sub-goal is satisfied, the satisfying fact is added to this argument.

:stack - the goal stack with which the invocation was made. The first item on this
stack is the immediately dominating goal. If all the sub-goals are satisfied, we can
conclude that the first element on the stack is satisfied.

:context - the assumption context in which this is invoked.




Conjunctive  Sub-Goal  Mechanism 


(rule ((:f (conjunctive-goals :first :rest :dep (:top . :stack) :context)))
  (goal-assert :first "(:top . :stack) :context "(conjunctive-goals :f))
  (rule ((:g (not :first)))
    (assert "(not :top) "(conj-goal-refutation :g)))
  (rule ((:g :first))
    (cond
      (:rest
        (assert
          "(conjunctive-goals ,(car :rest) ,(cdr :rest) (:g . :dep)
                              (:top . :stack) :context)
          "(conjunctive-goal-control :f :g)))
      (t
        (assert :top "(conjunctive-goals :g . :dep))))))
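The control structure of the conjunctive sub-goal mechanism can be summarized in a short Python sketch. It is a simplification under stated assumptions: the function prove stands in for the whole goal and rule machinery, and the status strings, the lookup table and the fact names are hypothetical. The sketch also appends satisfying facts to the dependency list, where the rule above conses them onto the front.

```python
def conjunctive_goals(subgoals, prove):
    """Try sub-goals in order; prove(goal) returns (status, fact)."""
    deps = []
    for goal in subgoals:
        status, fact = prove(goal)
        if status == "refuted":
            # A refuted sub-goal refutes the calling goal and stops the iteration.
            return ("refuted", [fact])
        if status != "satisfied":
            # Later sub-goals are not tried until this one is satisfied.
            return ("pending", deps)
        deps.append(fact)
    # All sub-goals satisfied: the calling goal depends on every satisfying fact.
    return ("satisfied", deps)

known = {"p": ("satisfied", "F-1"), "q": ("satisfied", "F-2"), "r": ("refuted", "F-3")}

def lookup(goal):
    return known.get(goal, ("unknown", None))

all_done = conjunctive_goals(["p", "q"], lookup)
stopped = conjunctive_goals(["p", "r", "q"], lookup)
waiting = conjunctive_goals(["p", "z", "q"], lookup)
```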

Proof by cases

Split a disjunction, creating a case analysis. In each case attempt to show that the
current clause of the disjunction implies the desired goal. This is done by creating a
sub-goal which is the conjunction of these implications. Conjunction Introduction will
invoke the conjunctive goal mechanism to conduct the proof.

(rule ((:g (goal :p for :goal in :context))
       (:h (or . :q)))
  (propose-method "(method :g (splitting :h)) "(Split-With :g :h)
    (let ((:qr "(and ,(mapcar '(lambda (x) "(implies ,x ,:p)) :q))))
      (goal-assert :qr "(:p . :goal) :context
                   "(use-conj-goals-for-case-split :g :h)))))




Implication  Introduction 

(rule ((:f (goal (implies :a :b) for :goal in :context)))
  (propose-method "(method :f (standard-imp-intro)) "(II :f)
    (goal-assert :b
                 "((implies :a :b) . :goal)
                 "(:a . :context)
                 "(implies-subgoal :f))
    (assume :a "(standard-imp-intro-rules-assumption :f))
    (rule ((:h :a)
           (:g :b))
      (assert "(implies :a :b) "(cp :g (:h))))))


Disjunction  Introduction 


(rule ((:f (goal (or . :d) for :goal in :context)))
  (propose-method "(method :f (standard-dis-intro)) "(DI :f)
    (do ((:dis :d (cdr :dis)))
        ((null :dis))
      (let ((:car-dis (car :dis)))
        (goal-assert :car-dis "((or . :d) . :goal) :context "(disj-intro :f))
        (rule ((:g :car-dis))
          (assert "(or . :d) "(dis-intro :g)))))))


Conjunction  Introduction 


(rule ((:f (goal (and :cf . :cr) for :goal in :context)))
  (propose-method "(method :f (standard-conj-intro)) "(CI :f)
    (assert
      "(conjunctive-goals :cf :cr () ((and :cf . :cr) . :goal) :context)
      "(use-conj-goals-for-and-intro :f))))




If-Then-Else Introduction

(rule ((:f (goal (if :p then :q else :r) for :goal in :context)))
  (propose-method "(method :f (standard-i-t-e)) "(ITE :f)
    (assert
      "(conjunctive-goals (implies :p :q) ((implies (not :p) :r)) ()
                          ((if :p then :q else :r) . :goal) :context)
      "(use-conj-goals-for-i-t-e :f))))


Backward  Chaining 


(rule ((:f (goal :q for :goal in :context))
       (:g (implies :p :q)))
  (propose-method "(method :f (backward-chaining :g)) "(BC :f :g)
    (goal-assert :p "(:q . :goal) :context "(backward-chain :f :g))))
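The logical content of backward chaining can be shown in a few lines of Python. This sketch departs from REASON in one important respect: it searches immediately and depth-first, whereas the rule above only proposes a method to the task agenda and leaves the choice to the control layer. The function name, the pair representation of implications, and the depth bound are assumptions of the example.

```python
def backward_chain(goal, facts, implications, depth=10):
    """True if goal is a known fact or follows by chaining back through implications."""
    if goal in facts:
        return True
    if depth == 0:
        return False                 # crude guard against circular implications
    return any(
        backward_chain(p, facts, implications, depth - 1)
        for (p, q) in implications   # each pair (p, q) stands for (implies p q)
        if q == goal
    )

facts = {"a"}
implications = [("a", "b"), ("b", "c"), ("d", "e")]
c_proved = backward_chain("c", facts, implications)
e_proved = backward_chain("e", facts, implications)
```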


Backward  Chaining  Through  if-then  Else 


(rule ((:f (goal :q for :goal in :context))
       (:g (if :p then :q else :r)))
  (propose-method "(method :f (if-then-back-chain :g)) "(ITBC :f :g)
    (goal-assert :p "(:q . :goal) :context "(i-t-e-back-chain :f :g))))

(rule ((:f (goal :r for :goal in :context))
       (:g (if :p then :q else :r)))
  (propose-method "(method :f (if-else-back-chain :g)) "(ITEBC :f :g)
    (goal-assert "(not :p) "(:r . :goal) :context "(i-t-e-back-chain :f :g))))


Indirect  Proof 


(rule ((:f (goal :p for :goal in :context)))
  (propose-method "(method :f (indirect-proof)) "(IP :f)
    (assume "(not :p) "(indirect-proof-assumption :f))))




If this leads to a contradiction and if this contradiction actually depends on the
assumption of (not p), then the truth maintenance system will determine that this
assumption is responsible for the contradiction. It will then bring in p. This will
force the assumption out. This process is called dependency-directed backtracking; it is
described in [Doyle, 1978] and in [Stallman & Sussman, 1977].


Contrapositive  Deduction 


(rule ((:f (implies :p :q))
       (:g (not :q)))
  (assert "(not :p) "(contrapositive :g :f)))

(rule ((:f (goal (not :p) for :goal in :context))
       (:g (implies :p :q)))
  (propose-method "(method :f (contrapositive-chaining :g)) "(CPC :f :g)
    (goal-assert "(not :q) "((not :p) . :goal) :context
                 "(contrapositive-back-chain :f :g))))


DeMorgan's  Rules 


(Not (and a b c ...)) => (or (not a) (not b) (not c) ...)

(Not (or a b c ...)) => (and (not a) (not b) (not c) ...)

(rule ((:f (not (and . :x))))
  (let ((:disj "(or ,(mapcar '(lambda (x) "(not ,x)) :x))))
    (assert :disj "(de-morgan :f))))

(rule ((:f (not (or . :x))))
  (let ((:conj "(and ,(mapcar '(lambda (x) "(not ,x)) :x))))
    (assert :conj "(de-morgan :f))))




Negation  and  Quantifiers 


(Not (For-all :vars :p :q)) <=> (There-is :vars :p such-that (not :q))

(Not (There-is :vars :p such-that :q)) <=> (For-all :vars :p (not :q))

(rule ((:f (not (for-all :vars :p :q))))
  (assert "(there-is :vars :p such-that (not :q))
          "(negated-for-all :f)))

(rule ((:f (not (there-is :vars :p such-that :q))))
  (assert "(for-all :vars :p (not :q))
          "(negated-there-is :f)))

(rule ((:f (goal (not (for-all :vars :p :q)) for :goal in :context)))
  (propose-method "(method :f (standard)) "(Quant :f)
    (goal-assert "(there-is :vars :p such-that (not :q))
                 "((not (for-all :vars :p :q)) . :goal)
                 :context
                 "(negated-for-all-standard-sub-goal :f))))

(rule ((:f (goal (not (there-is :vars :p such-that :q)) for :goal in :context)))
  (propose-method "(method :f (standard)) "(Quant :f)
    (goal-assert "(for-all :vars :p (not :q))
                 "((not (there-is :vars :p such-that :q)) . :goal)
                 :context
                 "(negated-there-is-standard-sub-goal :f))))
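DeMorgan's rules and the quantifier negation rules all move a negation one level inward. The following Python sketch shows that rewriting on formulas represented as nested tuples. The function name push_not and the tuple layout (including the literal "such-that" marker) are assumptions chosen to mirror the notation above; the sketch rewrites a formula directly and records no justifications.

```python
def push_not(formula):
    """Move a leading 'not' one level inward; return other formulas unchanged."""
    if not (isinstance(formula, tuple) and len(formula) == 2 and formula[0] == "not"):
        return formula
    inner = formula[1]
    if not isinstance(inner, tuple) or not inner:
        return formula
    op = inner[0]
    if op == "and":
        return ("or",) + tuple(("not", x) for x in inner[1:])
    if op == "or":
        return ("and",) + tuple(("not", x) for x in inner[1:])
    if op == "for-all":
        _, variables, p, q = inner
        return ("there-is", variables, p, "such-that", ("not", q))
    if op == "there-is":
        _, variables, p, _such_that, q = inner
        return ("for-all", variables, p, ("not", q))
    return formula

demorgan = push_not(("not", ("and", "a", "b", "c")))
negated_universal = push_not(("not", ("for-all", (":x",), ("object", ":x"), ("P", ":x"))))
```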


Section  4.3:  Closing  The  Reflexive  Loop 

So far I have shown the use of the task agenda only in the context of theorem
proving. However, the protocol shown above is actually the way REASON does
anything which needs to be open to introspective control. Tasks other than theorem
proving goals are entered into the system by making a task assertion which is treated
in much the same manner as the goal assertions. Such assertions stimulate the
proposal of methods; methods are chosen by the acciatoa just as was shown above.

Frequently some partial ordering must be imposed on the execution of tasks.
This is done by asserting a control-flow assertion mentioning the two tasks which are to
be ordered. Similarly a task can make information available for use by asserting that
some object is one of its outputs. This information can be propagated to other tasks
by the assertion of data-flow assertions which mention the output port of the first task
and an input port of some other task.

Of course these are just the primitives which are used to describe programs in the
plan diagram formalism! In fact, I am currently working on building a catalogue of
useful plans for use by REASON itself. These will be coupled with a set of rules
which state that a useful way to accomplish some task is to apply one of the plans
from the catalogue. Of course these plans create sub-tasks and REASON will have to
choose methods to accomplish each of these. However, it is often the case that there
is an a priori good choice for some of the sub-tasks of a particular plan. Thus,
extremely useful pragmatic information can be provided to the wano* by building
rules which analyze the history of method and plan selection and use this analysis to
select methods. This is roughly the approach followed in [McDermott, 1977].
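The ordering and data-propagation assertions described in this section can be pictured with a small Python sketch. It is an illustration only: the task names, the port names, and the representation of each assertion as a pair are hypothetical, and the standard-library topological sorter (Python 3.9 or later) stands in for whatever scheduling the task agenda actually performs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each pair (before, after) plays the role of one ordering assertion.
control_flow = [("read-input", "sort"), ("sort", "print-result")]

# Each pair links an output port (task, port) to an input port (task, port).
data_flow = [(("read-input", "out"), ("sort", "in")),
             (("sort", "out"), ("print-result", "in"))]

sorter = TopologicalSorter()
for before, after in control_flow:
    sorter.add(after, before)        # `after` may run only once `before` has run
order = list(sorter.static_order())

def propagate(outputs):
    """Copy each available output value to the input ports connected to it."""
    return {dst: outputs[src] for src, dst in data_flow if src in outputs}

inputs = propagate({("read-input", "out"): "list-1"})
```

Only a partial order is required, so any order consistent with the assertions would be acceptable; the chain above happens to admit exactly one.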

Although  this  is  so  far  past  the  current  development  of  REASON  that  it  now 
seems  to  be  science  fiction,  there  is  yet  another  advantage  to  this  approach. 
REASON  is  itself  a  program  written  to  analyze  programs;  the  language  in  which  a 
substantial  part  of  REASON  is  written  is  the  language  it  is  capable  of  analyzing. 
Therefore,  it  is  possible  for  REASON  to  analyze  itself!  (At  least  in  principle,  it  is 
possible). 

In later chapters, I will often refer to REASON's protocols. These are sets of
tasks  to  be  entered  into  the  task  agenda;  they  are  represented  in  the  plan  language. 
In  the  next  section  I  will  develop  some  more  representations  used  in  the  reasoning 
system;  I  will  then  turn  to  a  deeper  look  at  the  plan  language. 

Section 4.4: Equality, Reference and Anonymous Objects

If one wishes to show that a program has a certain property one must show that
whatever inputs the program is given, it will still behave in accordance with the
property. The method used in REASON is to evaluate the behavior of the program
when presented with typical inputs. Anonymous Objects [Hewitt, 1975],
[Rich & Shrobe, 1976], [Moore, 1975] (they are called formal objects in
[Sussman, 197?]) are used to represent such typical inputs. An anonymous object is
one whose identity is unknown in the sense that given an anonymous object and any
other object, it is a priori unknown whether or not the two objects are identical (in
the sense of being the same object).



AD-A078  055  MASSACHUSETTS  INST  OF  TECH  CAMBRIDGE  ARTIFICIAL  INTE— ETC  F/6  6/4 

DEPENDENCY  DIRECTED  REASONING  FOR  COMPLEX  PROGRAM  UNDERSTANDING— ETC (U) 
APR  79  H  E  SHROBE  N00014-75-C-0643 

UNCLASSIFIED  AI-TR-503  NL 




Anonymous objects provide a convenient stand-in for unknown information of
various kinds. For example, suppose we wanted to say that the third field of the
record input to procedure-1 is a sorted list. If we did not know exactly what item was
input to procedure-1 we would have to set up some token to stand for that item and
similarly for the third field of this record. We might do this as follows:

(input the-record procedure-1 Anon-1)

(third-field Anon-1 Anon-2)

(sorted-list Anon-2)

Notice, however, that the first two of these predicates are functional; they uniquely
determine their last argument. It is, therefore, possible to use a more concise
notation:

(sorted-list [third-field [input the-record procedure-1]])

Each expression in brackets ([ ]) is read "the object which satisfies ..." and refers
to the unique object which could appear in the final position of the equivalent
predicate. In fact, such reference expressions can be constructed for predicates which
are known to be functional in any position. For example, if we knew that there were
a unique list which contains entry-1 as a member, we could refer to this list as follows:

[Member :list entry-1]

in which the variable :list indicates the position of the statement which is being
referred to. The reference expressions above are further abbreviations of this notation
in which the last position is the variable and is by convention simply dropped.

Whenever REASON asserts a statement with reference expressions in it, it
attempts to resolve the references. Reference resolution involves two stages. First, if
there is an object which satisfies the reference expression, that object is substituted for
the reference expression. Second, if no such object exists, an anonymous object is
created to satisfy the reference expression. For example, suppose the following is
asserted:



F-1 (left pair-1 [right pair-1])

To process this assertion, REASON must resolve the reference expression

[right pair-1]

i.e. it must find an assertion matching the pattern

(right pair-1 :obj)

There are two cases. Suppose the data-base already contained the assertion:

F-2 (right pair-1 The-answer)

Then the processing would be completed by asserting:

F-3 (left pair-1 The-answer) (Reference-Resolution F-1 F-2)

If, however, the data-base contained no assertion matching the pattern

(right pair-1 :obj)

then the system would create the anonymous object OBJECT-1, and assert that it satisfies
the reference expression. Notice that this assertion is not an assumption since it is
not really saying anything new. The reference expression itself only says that there is
some object satisfying the expression; resolving the reference by creating the
anonymous object merely gives this object a name. Since the anonymous object
created to resolve the reference might be equal to any other object, this new assertion
cannot be false; therefore, it is justified as a premise, i.e. its truth does not depend on
the truth of any other assertion.

Since resolving the reference expression creates a new assertion, processing may
now proceed as above:




F-2 (Right pair-1 OBJECT-1) (Premise)

F-3 (Left pair-1 OBJECT-1) (Reference-Resolution F-1 F-2)

Of  course,  there  is  no  restriction  on  the  nesting  of  reference  expressions,  so  the 
processing  is  recursive. 
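The two-stage, recursive resolution process can be sketched in Python. This is a minimal illustration under stated assumptions, not REASON's implementation: statements are tuples, a reference expression is written as a Python list standing for the bracketed form, each predicate is taken to be functional in its last argument, and justifications are omitted. The class and method names are hypothetical.

```python
import itertools

class DataBase:
    def __init__(self):
        self.facts = []                      # e.g. ("right", "pair-1", "OBJECT-1")
        self._fresh = itertools.count(1)

    def resolve(self, ref):
        """Return the object denoted by a reference expression such as ["right", "pair-1"]."""
        # Nested reference expressions are resolved first, so the process is recursive.
        args = tuple(self.resolve(a) if isinstance(a, list) else a for a in ref)
        for fact in self.facts:
            if fact[:-1] == args:            # stage 1: an object already satisfies it
                return fact[-1]
        anon = f"OBJECT-{next(self._fresh)}" # stage 2: create an anonymous object
        self.facts.append(args + (anon,))    # asserted as a premise
        return anon

    def assert_fact(self, statement):
        fact = tuple(self.resolve(a) if isinstance(a, list) else a for a in statement)
        self.facts.append(fact)
        return fact

known = DataBase()
known.facts.append(("right", "pair-1", "The-answer"))
case1 = known.assert_fact(("left", "pair-1", ["right", "pair-1"]))

empty = DataBase()
case2 = empty.assert_fact(("left", "pair-1", ["right", "pair-1"]))
```

The two runs correspond to the two cases in the text: case1 reuses the existing object, while case2 creates OBJECT-1 and records the assertion that names it.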


Identification 

The use of reference expressions raises the possibility that we might wind up with
two distinct names for the same object. For example, in resolving the reference
expression above we created the anonymous object OBJECT-1 to stand for the right part
of pair-1 and from this we deduced that the left part of pair-1 is also OBJECT-1.
However, suppose that we had the following assertion in the data-base:

F-4 (Left pair-1 LEFT-55) (some-justification ...)

Since a pair can only have one left part, it must be the case that LEFT-55 and OBJECT-1
are the same object. We are then faced with the problem of what to do with these
two names for the same thing, i.e. the problem of handling equality. REASON uses a
rather unusual tactic in this situation. The standard tactic in most reasoning systems
is to build up equivalence classes of equal objects. This, however, imposes a price
when searching for a match since one must check for variants of the desired assertion
using any possible representative of the equivalence class. REASON instead
eliminates this possibility by doing the work in advance; it makes one of the objects
"disappear". In the current example, there is really no further use for the name
OBJECT-1, since it was merely created as a stand-in when we lacked the information to
know what the right part of pair-1 was. However, we have now deduced that
information, so the stand-in is unnecessary.

Since we have learned that OBJECT-1 is really LEFT-55, we can substitute LEFT-55 for
OBJECT-1 in any assertion in which OBJECT-1 occurs. This process is called identification.
To make this possible, REASON maintains an index of which assertions the various
objects occur in; the index is represented by assertions of the form

(Occurs-in <object> <fact-name>)




Needless to say, these assertions are not indexed with more occurs-in assertions. Every
time a new assertion is added to the data-base, this index is updated. When an
identification is required, it is then simple to retrieve the appropriate assertions and
make the substitutions.

However, this process as explained so far would result in another problem.
Namely, every assertion mentioning OBJECT-1 is now parallelled by an equivalent
assertion mentioning LEFT-55. It would be wasted effort for both of these assertions to
be retrieved every time some information was desired. REASON, therefore, maintains
a mark on each assertion, called the utility mark, which serves a function similar to
that of the in-out mark. If an assertion has its utility mark set, then it is regarded as
being useless; no rule will trigger on it and it will not be retrieved by a normal fetch
request. However, it is still regarded as being true.

As REASON goes through the identification process, it sets the utility mark of
each assertion which mentions the anonymous object being identified away (OBJECT-1 in
this case). Each new assertion depends on both the id assertion and the original
assertion from which it was built. In our current example, we have:


F-2 (Right pair-1 OBJECT-1)

F-3 (Left pair-1 OBJECT-1)

F-4 (Left pair-1 LEFT-55)

As noted, F-3 and F-4 imply that OBJECT-1 and LEFT-55 are identical. Thus, an id
assertion is derived, initiating the identification process. The following assertions
result:

F-5 (id OBJECT-1 LEFT-55) (Part-Identification F-3 F-4)

F-6 (Right pair-1 LEFT-55) (Identification F-2 F-5)

In addition, both F-2 and F-3 will have their utility mark set. The following rules
implement this process:




(trigger-rule ((:f (id :obj-1 :obj-2))
               (:g (occurs-in :obj-1 :fact-1)))
  (set-utility-mark :fact-1 'useless)   ; mark the fact useless
  (let ((:new-fact (subst :obj-2 :obj-1 :fact-1)))
    (assert :new-fact "(identification :f :g))))

(rule ((:f (part :obj-type :part-name))
       (:g (type :obj-type :obj)))
  (rule ((:h (:part-name :obj :part-1))
         (:i (:part-name :obj :part-2)))
    (or (eq :part-1 :part-2)
        (assert "(id :part-1 :part-2)
                "(part-identification :f :g :h :i)))))

It is important to understand the distinction made here between the utility mark and
the notion of in and out. In and out deal with belief (or logical relationships) while
the utility mark is strictly an issue of control (of heuristic value). A fact which is in
but whose utility mark is set is still regarded as true (or believed) even though it will
be ignored. This is crucial since the justification for fact F-6 above is F-2, a fact
whose utility mark is set. If F-2 were regarded as not being believed (as opposed to
simply not being useful) then F-6 would have no support and would itself be out. A
fact whose utility mark is set may support belief in other facts although its presence
will otherwise be ignored.

Whenever the truth maintenance system notifies REASON that the belief status
of an id assertion has changed from in to out, REASON will remove the utility mark
from each assertion which had previously been marked. This situation arises
frequently in hypothetical reasoning, when for sake of argument the system
assumes that two objects are identical, leading to an identification process. Later
when the system retracts this assumption, the id assertion will become out, removing
the support for the covered assertions. When the utility mark of an assertion is
removed, the system will run any rule which matches the assertion but which has not
yet executed.
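The interaction between the occurrence index, the utility mark and a normal fetch can be sketched in Python. This is a hedged illustration with several simplifying assumptions: statements are flat tuples, the class and method names are hypothetical, the object names are illustrative, and neither the in-out mark nor the removal of utility marks on retraction is modelled.

```python
class Assertion:
    def __init__(self, statement, justification=()):
        self.statement = statement
        self.justification = justification
        self.useless = False          # the utility mark

class DataBase:
    def __init__(self):
        self.assertions = []
        self.occurs_in = {}           # object -> assertions that mention it

    def add(self, statement, justification=()):
        fact = Assertion(statement, justification)
        self.assertions.append(fact)
        for obj in statement[1:]:     # keep the occurrence index up to date
            self.occurs_in.setdefault(obj, []).append(fact)
        return fact

    def fetch(self, predicate):
        """Normal fetch: assertions whose utility mark is set are skipped."""
        return [f.statement for f in self.assertions
                if f.statement[0] == predicate and not f.useless]

    def identify(self, anon, known):
        """Identify the anonymous object `anon` with `known`."""
        id_fact = self.add(("id", anon, known))
        for fact in list(self.occurs_in[anon]):
            if fact is id_fact:
                continue
            fact.useless = True       # still true, but no longer retrieved
            new = tuple(known if x == anon else x for x in fact.statement)
            if new not in self.fetch(new[0]):
                self.add(new, ("identification", fact, id_fact))
        return id_fact

db = DataBase()
f2 = db.add(("right", "pair-1", "OBJECT-1"))
f3 = db.add(("left", "pair-1", "OBJECT-1"))
f4 = db.add(("left", "pair-1", "LEFT-55"))
db.identify("OBJECT-1", "LEFT-55")
```

After the identification, a fetch sees only assertions that mention LEFT-55, while the marked assertions remain in the data-base and can still serve as justifications.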




Section 4.5: Situational Logic

So far I have ignored the need to represent the temporal behavior of programs.
The temporal nature of a fact is indicated by tagging an assertion with a situation tag
[McCarthy, 1968] indicating when that assertion is believed to be true; different points
in time are represented by different situation tags. Thus, we might write

((first list-1 obj-1) situation-1)

((first list-1 obj-2) situation-2)

to indicate that the first object in list-1 is obj-1 at one point of time while it is obj-2
at another. This does not imply that obj-1 is identical to obj-2; the actual rule for
identification used in REASON requires that the situations of the two assertions be
identical.
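The situation requirement on identification can be illustrated with a short Python sketch. It is an assumption-laden example rather than the actual rule: tagged assertions are pairs of a statement and a situation, every predicate is taken to be functional in its last argument, and the third fact (obj-3) is added purely for illustration.

```python
facts = [
    (("first", "list-1", "obj-1"), "situation-1"),
    (("first", "list-1", "obj-2"), "situation-2"),
    (("first", "list-1", "obj-3"), "situation-2"),   # hypothetical extra fact
]

def identifications(tagged_facts):
    """Pairs of objects forced to be identical: same statement prefix, same situation."""
    pairs = []
    for i, (stmt1, sit1) in enumerate(tagged_facts):
        for stmt2, sit2 in tagged_facts[i + 1:]:
            same_slot = stmt1[:-1] == stmt2[:-1]
            if sit1 == sit2 and same_slot and stmt1[-1] != stmt2[-1]:
                pairs.append((stmt1[-1], stmt2[-1]))
    return pairs

ids = identifications(facts)
```

Only obj-2 and obj-3 are identified, since they fill the same position in the same situation; obj-1 is the first element in a different situation and is left alone.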

It is often the case that we need to make reference expressions within this
temporal notation. This is indicated with a temporal reference expression which is
denoted using braces ({ ... }). For example:

((first list-1 {[first list-2] situation-1}) situation-2)

says that the object which is the first element of list-2 in situation-1 is also the first
element of list-1 in situation-2. Notice that it does not follow from this that there is
any situation in which the first element of list-2 is ever the same object as the first
element of list-1. A temporal reference expression has two parts: the assertion
expression and the situation expression; either of these may be a simple reference
expression. Temporal reference expressions are handled in essentially the same way as
simple reference expressions discussed above.

Some assertions are trans-situational in that they relate assertions or objects from
different situations. id assertions are an obvious example of this. If two objects are
identical, then they are identical in all situations (for all time as it were). Thus, id
assertions are never situationally tagged.



Logical connectives may also be used to build trans-situational assertions. For
example:

(Or ((first list-1 obj-1) s1) ((last list-1 obj-1) s2))

is trans-situational. It is not true in any situation, but rather is an assertion relating
facts in different situations. These will be important in chapter 11 where I discuss
reasoning about side effects.

REASON also allows assertions to relate the states of objects at different points
in time. Suppose we wished to describe the behavior of the MACLISP NREVERSE
program, which reverses a list by changing the pointers in its cells. This program
works by side effect; we want to say that the list which is the output of this program
is the reverse of the list which is the input. However, we are talking about the same
list in both cases, since the program works by side effect. The output cell is the one
which was the last cell of the input list; after the NREVERSE program has run, this cell is
the head of the reversed list. Suppose that s1 is the situation just before the NREVERSE
program executes and s2 is the situation just afterwards. We can then write:

(Reverse <[l1 s1]> <[l2 s2]>)

An expression within angle brackets (<[ ... ]>) is called an object-state expression. It
is read as "the state of ... in situation ...". If an assertion mentions only object-state
expressions and the situation part of each such expression is the same situation, then
the assertion is equivalent to a situationally tagged assertion mentioning the objects of
each object-state expression and tagged with the common situation tag.

(P <[o1 s1]> <[o2 s1]> ... <[on s1]>)

<=>

((P o1 o2 ... on) s1)
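The reduction rule above can be sketched in Python. This is only an illustration, not REASON's implementation: the `ObjectState` class, the tuple encoding of assertions, and the function name `reduce_to_tagged` are all assumptions made for the example.

```python
from typing import NamedTuple, Optional, Tuple


class ObjectState(NamedTuple):
    """Hypothetical encoding of <[obj situation]>: the state of obj in situation."""
    obj: str
    situation: str


def reduce_to_tagged(assertion: Tuple) -> Optional[Tuple]:
    """Return ((pred obj ...), situation) when every argument is an object-state
    expression and all of them share one situation; otherwise return None."""
    pred, *args = assertion
    if not args or not all(isinstance(a, ObjectState) for a in args):
        return None
    situations = {a.situation for a in args}
    if len(situations) != 1:
        return None  # genuinely trans-situational, so no single tag applies
    return ((pred, *(a.obj for a in args)), situations.pop())


same = ("First", ObjectState("l1", "s1"), ObjectState("x", "s1"))
mixed = ("Reverse", ObjectState("l1", "s1"), ObjectState("l2", "s2"))
print(reduce_to_tagged(same))
print(reduce_to_tagged(mixed))
```

The second call returns `None` because its two object-state expressions name different situations, which is the case the rule does not cover.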

Notice that the Reverse assertion above is not reducible to a simple situationally tagged
assertion. However, consider what would happen if the Reverse assertion were replaced
by its definition. We define Reverse recursively, saying that one list is the reverse of
the other if the first object of one is the last object of the other and if the rest of
the first list is the reverse of the fragment of the second list beginning with the first
object and continuing up to the last (I will be more specific about how such
definitions are stated in chapter 10). We would then obtain the following:


(Reverse <[l1 s1]> <[l2 s2]>)

<=> (And (First <[l1 s1]> [last <[l2 s2]>])
         (Reverse [rest <[l1 s1]>] [leading-fragment <[l2 s2]>]))

This will lead to reference resolution as shown earlier. However, as we begin to
resolve the references we will see that many of the expressions are simple reference
expressions (i.e. without situational tags) which involve only object-state descriptions
from a single state. For example, from the first clause we obtain the following:

[last <[l2 s2]>]

Using the rule stated above this is resolved to

((last l2 anon-1) s2)

and the value of the reference expression is anon-1. Thus, following the same rule, the
enclosing clause becomes

((first l1 anon-1) s1)

If we continue this process we will ultimately wind up with only simple situationally
tagged assertions. Actually, we also wind up with a second trans-situational Reverse
expression. However, by induction, this will also lead to a set of simple situationally
tagged assertions, none of which are trans-situational. If a defined relation including
object-state descriptions can be reduced to simple assertions which are not
trans-situational, then the relation makes sense as a trans-situational assertion. Reverse is
such a relation. Membership in a data-structure is not; an examination of the
definition of list membership, for example, shows that any trans-situational use of
membership is incoherent.

A  similar  reduction  can  be  applied  to  trans-situational  assertions  built  from 
logical  connectives  in  which  each  clause  is  tagged  with  the  same  situation. 




(Or (P1 s1) ... (Pn s1))

<=>

((Or P1 ... Pn) s1)

A  similar  rule  applies  to  negation: 

(not (P s1)) <=> ((not P) s1)




Chapter  5:  Describing  Programs 

I will use two distinct methods of describing program segments. The first of
these, called specs, is a formalism for specifying a segment's input/output behavior.
The second, called plan diagrams, is used to build a complex segment by connecting
together simpler ones. Intuitively, the specs represent the properties of the program to
be proved and the plan diagram represents the program. Analysis consists of showing
that the behavior which results from a plan diagram is that required by the segment's
specs.


Section 5.1: Specs - I/O Descriptions

Simple specs consist of four sets of clauses: Inputs, Outputs, Expects and Asserts.
The first two of these are simply lists of internal names, or ports, for the data objects
which are the inputs (outputs) of the segment being specified. The expect clauses are
a set of requirements which must be satisfied at the time the segment is applied to its
inputs. Typically these are type constraints or simple relationships between the input
objects. Finally, the assert clauses are a set of conditions which are promised to hold
immediately after the segment has finished its execution. The assert clauses may
mention both input and output objects, providing a convenient method for describing
side-effects on the input objects.

We can use the specs formalism to specify a program which calculates the fringe
of a tree as follows:


(defspecs fringe
  (Inputs the-tree)
  (Expect (Object-type the-tree binary-tree))
  (Outputs the-fringe)
  (Assert (Object-type the-fringe list)
          (for-all (:the-node) (leaf-node the-tree :the-node)
                   (member the-fringe :the-node))
          (for-all (:the-node) (member the-fringe :the-node)
                   (leaf-node the-tree :the-node))))




Spec clauses are written in a variant of the predicate calculus which uses the pattern
matching syntax of artificial intelligence languages and the situation tag
notation of the situational calculus [McCarthy, 1968]. The identifiers preceded by
colons (e.g. :the-node) are variables; thus, the two quantifiers say that every leaf node
of the tree is represented in the output of fringe and, conversely, that only the leaf
nodes are represented. Where it is possible to unambiguously omit the situation tag
on a predicate we do so. This is almost always possible, since the use of distinct
input and output names for the same object defines which situation is meant. For
example, the first clause of the first quantifier above mentions only input objects and
is, therefore, taken to apply in the input situation. The second clause refers to an
output object and, therefore, refers to the output situation. In cases where this is not
possible the two special symbols *before* and *after* are available as names of the
input and output situations. When specs are used in the symbolic evaluation process,
the symbolic evaluator defaults in the appropriate situational tags.

Specs may also have a case structure which reflects the ability of the segment to
cause control branching. This is done by adding case clauses. For example, a test
which checks whether a node is a leaf node can be specified as follows:


(defspecs leaf?
  (Inputs the-node)
  (Expect (Object-type the-node binary-tree-node))
  (Case-1
    (when (Object-type the-node leaf)))
  (Case-2
    (when (not (Object-type the-node leaf)))))

This says that when the input node is a leaf we take one control branch and when it is
not we take the second branch. As above, the segment has expect clauses which must
be satisfied. This segment produces no outputs and has no side-effects. There are,
therefore, no outputs or assert clauses. Segments may have any number of cases and these
cases may have outputs and assert clauses nested within them. This allows us to
specify complicated segments which create control branches as well as producing
outputs. A lookup routine for a complex data-structure might have such specs.



It is often necessary to state that several segments share the same I/O behavior.
One reason for this is that there are tasks for which several distinct algorithms exist;
these different algorithms lead to distinct segments, but their specs are identical.
There is a second need for saying that different segments have identical I/O behavior.
Consider the following code for the fringe program:


(defun fringe (tree)
  (fringe-1 tree nil))

(defun fringe-1 (tree acc)
  (cond ((leaf? tree) (cons tree acc))
        (t (fringe-1 (left tree) (fringe-1 (right tree) acc)))))

Notice that there are two recursive calls to fringe-1. Thus, there are three instances
of fringe which we might want to distinguish for some purposes while still maintaining
the awareness that these segments have a common I/O specification.
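As a rough illustration, the fringe program can be transliterated into Python and checked against the two quantified assert clauses of the fringe specs. The `Node` class, the rule that a leaf is a node with no children, and the helper names `leaf_nodes` and `satisfies_asserts` are assumptions of this sketch; they do not come from the report.

```python
from typing import List, Optional


class Node:
    """A binary-tree node; a leaf has no children (an assumption of this sketch)."""

    def __init__(self, label: str, left: "Optional[Node]" = None,
                 right: "Optional[Node]" = None) -> None:
        self.label = label
        self.left = left
        self.right = right


def is_leaf(node: Node) -> bool:
    return node.left is None and node.right is None


def fringe_1(tree: Node, acc: List[Node]) -> List[Node]:
    # Mirrors fringe-1: a leaf is consed onto the accumulation; otherwise the
    # right sub-tree is processed first and its result feeds the left call.
    if is_leaf(tree):
        return [tree] + acc
    return fringe_1(tree.left, fringe_1(tree.right, acc))


def fringe(tree: Node) -> List[Node]:
    return fringe_1(tree, [])


def leaf_nodes(tree: Node) -> List[Node]:
    # Independent enumeration of the leaves, used only for checking.
    if is_leaf(tree):
        return [tree]
    return leaf_nodes(tree.left) + leaf_nodes(tree.right)


def satisfies_asserts(tree: Node, result: List[Node]) -> bool:
    """Check the assert clauses: the result is a list, every leaf is a member,
    and every member is a leaf."""
    leaves = leaf_nodes(tree)
    every_leaf_is_member = all(any(l is m for m in result) for l in leaves)
    every_member_is_leaf = all(any(m is l for l in leaves) for m in result)
    return isinstance(result, list) and every_leaf_is_member and every_member_is_leaf


tree = Node("root", Node("a"), Node("inner", Node("b"), Node("c")))
result = fringe(tree)
print([n.label for n in result])
print(satisfies_asserts(tree, result))
```

Running a check like this on one tree only tests the specs on a single input; the symbolic interpretation described later is meant to establish them for all inputs.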

The name in a defspecs statement is therefore regarded as a spec-type, rather
than as the specs for any particular segment. If we need to indicate that a segment
has the specs in a defspecs statement, we state that its spec-type is the
spec-type-name in the defspecs clause. Thus, we could say that the two recursive calls in the
fringe program have the same I/O behavior as follows:


(spec-type left-fringe fringe)

(spec-type right-fringe fringe)

where left-fringe is the recursive call to fringe for the left sub-tree and right-fringe is
the recursive call for the right sub-tree. This does not yet allow us to state that
these two instances of fringe have identical internal structure. That is the subject of
the next section.




Section  5.2:  Plan  Diagrams 

Plan diagrams are a method of building a program segment by linking together
the behaviors of smaller segments. In talking about a plan diagram there will always
be a main segment (i.e. the segment described by the plan diagram) and a set of
sub-segments which are being linked to form the main segment. In turn, some of these
sub-segments will have plan diagrams and internal segments of their own. Thus, there
may be several levels of aggregation within a plan diagram.

Segments within a plan diagram may be joined by two kinds of links: data-flow
and control-flow. Since specs give a unique name to each input and output object of
a segment, we may specify how data flows between segments by stating which
object-name of the first segment flows to which object-name of the second. For example,
two sub-segments at the same level of detail might be linked by having the output of
one flow as an input to the other.


(dataflow (output <seg-1> <object-name-1>)
          (input <seg-2> <object-name-2>))

The data flow above is referred to as an output-input link; this is the only type of
data-flow link which can connect two sub-segments of a common plan. When we are
concerned with the links between the main segment and its sub-segments there are
two other kinds of data-flow links: Input-Input, in which an input of the main segment
is passed directly to a sub-segment as one of its inputs, and Output-Output, in which
the output of a sub-segment is passed to the main segment as one of its outputs.

Control-flow links are included for two purposes. A simple control-flow link
states that one segment must finish its execution before the other segment may begin
to execute. This is included so that segments with side-effects can be properly
ordered to avoid destructive interference. Data-flow links imply an ordering
relationship as well, since a segment may not execute until all its inputs have arrived.
Other than these constraints there is no ordering imposed; plan diagrams are always
interpreted in a (pseudo) parallel manner, even though I use them to analyze the
behavior of sequential processes.
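The ordering constraints implied by links can be illustrated with a small Python sketch that groups segments into "waves" which could execute in parallel. It treats each data-flow or simple control-flow link as a plain (before, after) pair, which is a simplification introduced here; the function name `execution_waves` and the sample links (loosely taken from the non-leaf branch of fringe) are likewise assumptions, and the standard-library `graphlib` module stands in for whatever mechanism REASON actually uses.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple


def execution_waves(links: List[Tuple[str, str]]) -> List[List[str]]:
    """Group segments into successive waves; segments in the same wave have no
    ordering constraint between them and could run in (pseudo) parallel."""
    predecessors: Dict[str, Set[str]] = {}
    for before, after in links:
        predecessors.setdefault(after, set()).add(before)
        predecessors.setdefault(before, set())
    sorter = TopologicalSorter(predecessors)
    sorter.prepare()  # raises graphlib.CycleError if the links are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical link set: each pair means "first must finish before second starts".
links = [
    ("left", "left-fringe"),          # data flow: sub-tree to recursive call
    ("right", "right-fringe"),        # data flow: sub-tree to recursive call
    ("right-fringe", "left-fringe"),  # data flow: accumulation
]
print(execution_waves(links))
```

Only the constraints stated by the links are enforced; within a wave the order is arbitrary, which matches the (pseudo) parallel reading of plan diagrams.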



The more complicated use of control-flow links is to specify where control will
go from a segment whose specs split into cases. Conditional-control-flow links connect
a particular case of a segment to its succeeding segment. Thus, if a segment has two
cases (a typical test) there will be one conditional-control-flow link for each case,
leading to that segment which should next be evaluated if that particular case is
applicable. A segment which terminates a conditional-control-flow link cannot execute
unless the initiating case of the link is applicable.


[Figure: Conditional-Control-flow Links]


It should be noted that control-flow links are not connected to a particular port of a
segment in the way that a data flow is. The symbolic interpreter which I will
describe in the next chapter interprets the control-flows, since they are extrinsic to the
segment. The presence of a control-flow is in no way related to any intrinsic
specification of the segment.




One final tool used in plan diagrams is the join, which is the inverse of a
case-splitting segment. A join merges several distinct control paths produced by a case
split into a single synthesizing control path. The join takes a set of input objects
from each control path and produces a set of output objects on the synthesizing
control path. It has a set of input expectations specifying conditions which must hold
true of the input objects flowing to it from each control path and assert conditions
about the merged output objects. Since the control paths which terminate at a join
are required to originate from a common case-splitting segment, it is impossible for
more than one of these paths to be active at any one time. Thus, the join is
applicable when one of its incoming control-flow links is active and the associated
input objects are available.

The fringe program employs a typical use of the join. Fringe tests whether the
current node is a leaf. If so, the leaf is accumulated. Otherwise, recursive calls are
made on the left and the right nodes, producing an accumulation. These two control
paths and their respective accumulations are then synthesized by a join which states
that whichever path was taken, the accumulation returned includes all the leaf nodes
of the input node.

We  may  use  these  notions  to  represent  the  fringe  program  diagrammatically  as 
follows: 




[Figure: Schematic Plan Diagram For Fringe]


Notice that in the above diagram, the "railroad track" lines represent control-flow,
while the solid lines represent data-flow.

This diagrammatic representation of the program has a direct translation into
specifications, data-flow and control-flow assertions as shown:




(dataflow (input fringe the-tree)
          (input leaf? the-node))
(dataflow (input fringe the-tree)
          (input process-non-leaf the-tree))
(dataflow (input fringe the-accumulation)
          (input process-non-leaf the-accumulation))
(dataflow (input fringe the-accumulation)
          (input accumulate-fringe the-accumulation))
(dataflow (input fringe the-tree)
          (input accumulate-fringe the-new-element))
(dataflow (input process-non-leaf the-tree)
          (input left the-tree))
(dataflow (input process-non-leaf the-tree)
          (input right the-tree))
(dataflow (output right the-node)
          (input right-fringe the-tree))
(dataflow (input process-non-leaf the-accumulation)
          (input right-fringe the-accumulation))
(dataflow (output left the-node)
          (input left-fringe the-tree))
(dataflow (output right-fringe accumulation)
          (input left-fringe the-accumulation))
(dataflow (output left-fringe accumulation)
          (output process-non-leaf accumulation))
(dataflow (output process-non-leaf accumulation)
          (input (join-fringe case-2 the-accumulation)))
(dataflow (output accumulate-fringe the-accumulation)
          (input (join-fringe case-1 the-accumulation)))
(dataflow (output join-fringe the-accumulation)
          (output fringe the-accumulation))

(conditional-control-flow
  ((test-leaf case-2) process-non-leaf))
(conditional-control-flow
  ((test-leaf case-1) accumulate-fringe))

(control-flow accumulate-fringe (join-fringe case-1))
(control-flow process-non-leaf (join-fringe case-2))




(spec-type test-leaf leaf?)

(spec-type left-fringe fringe)

(spec-type right-fringe fringe)

(spec-type accumulate-fringe cons)

If two segments are internally identical, i.e. if they have identical internal
structure, then we say that they have the same plan-type. It follows that if segment-1
and segment-2 are of the same plan-type then they are also of the same spec-type.

A  plan  type  is  defined  using  six  clauses: 

(i)  A  list  of  sub-segment  names  (we  will  often  refer  to  these  names  as  roles  of  the 
plan). 

(ii)  A set of type constraints on the sub-segments, i.e. a list of what plan-type or
spec-type constraints they satisfy.

(iii)  A  set  of  data-flow  links. 

(iv)  A  set  of  control  flow  (and  conditional-control-flow)  links. 

(v)  A  set  of  input  names. 

(vi)  A  set  of  output  names. 

This  may  be  specified  as  follows: 


(defplan fringe
  (sub-segments test-leaf left-fringe right-fringe
                left right accumulate-fringe)
  (constraints (spec-type test-leaf leaf?)
               (plan-type left-fringe fringe)
               (plan-type right-fringe fringe)
               (spec-type accumulate-fringe cons)
               (spec-type left left)
               (spec-type right right))
  (flow-diagram (dataflow )
                ... ))

If two segments have the same plan-type, then their internal structure is identical
to the degree of detail specified in the plan diagram. Thus, their internal temporal
behavior is identical. Notice that the plan diagram for the plan-type fringe includes
two sub-segments whose plan-type is also fringe. It follows that if we wish to prove
some temporal property of the fringe plan-type we may do so by an inductive
argument, assuming that this property holds of the two sub-segments of plan-type
fringe and deriving the desired property of the main segment. It is important to
remember the distinction between plan-types and spec-types. Knowing a segment's
spec-type will not help in inductive proofs of its internal temporal properties; spec-type,
in contrast to plan-type, is strictly concerned with I/O behavior.




Chapter  6:  A  Symbolic  Interpreter  for  Plan  Diagrams 

In this chapter I will describe how REASON proves properties of a plan diagram
through a process called symbolic interpretation [Rich & Shrobe, 1976] [King, 1976]
[Smith & Hewitt, 1975] [Yonezawa, 1977]. This process is extremely thorough,
recording all dependencies between the various statements in the plan diagram and in
the sub-segments' specs. The mechanisms explained here lay the groundwork for being
able to describe the internal temporal behavior of a segment; this will be used in the
next chapter, where I will develop a more powerful set of descriptive tools used in the
process of plan recognition.

Recall from Chapter 4 that a situation is defined to be a point of time during a
computation, and that, in general, facts are true in a particular situation:

(<Assertion> <Situation>)

e.g. ((first list-1 object-1) Situation-1)

An application represents the result of applying a segment to a set of input
objects which satisfy the expectations of the segment, yielding a set of output objects
which satisfy the assert clauses of the segment. An application consists of (i) a
segment, (ii) a set of input objects and a mapping of these objects to the input names
of the segment, (iii) a set of output objects and a mapping of these objects to the
output names of the segment, (iv) an input situation, and (v) an output situation.
This is represented as follows:

(i)   (Segment-part <application> <segment-name>)

(ii)  (Input <application> <segment-input-object-name> <object>)

(iii) (Output <application> <segment-output-object-name> <object>)

(iv)  (Input-situation <application> <situation>)

(v)   (Output-situation <application> <situation>)

Each of these relations is a function (i.e. uniquely determines its last argument); we
may, therefore, use such descriptions to refer to an object unambiguously, using the
bracket notation defined in Chapter 4:


(comes-before [Input-situation application-1] situation-5)




which says that the input situation of application-1 precedes situation-5.

A plan diagram for a segment determines those applications which will take place
during the segment's execution. It also determines the set of input and output
situations of these applications. Finally, the plan diagram determines a partial
ordering on these situations which represents the minimal ordering constraint on
segment execution consistent with correct execution. This information can be made
explicit by a symbolic interpretation of the plan diagram.

Given a set of input objects for a main segment we may interpret the plan
diagram in much the same way as a LISP interpreter interprets its code.
However, since our concern is with the general behavior of the plan diagram, the
input objects will be symbolic values representing typical inputs. Thus, any behavior
which can be shown to result from applying the segment to these symbolic objects
must necessarily also be true when applied to any actual input.

I have so far shown specs and plan diagrams as "packages", i.e. as a single large
set of statements. However, in order to build dependencies correctly, the various
sub-parts of these packages must be accessible as individual statements. REASON,
therefore, expands plan diagrams and specs into an internal format in which each
separate idea is represented as an individual fact. We will see how these are used as I
explain the symbolic evaluation process.

The interpretation begins by creating an anonymous object to stand for the
current application of the main segment. For simplicity this name is always chosen as
the plan-type name of the diagram. REASON then proceeds, assigning the input
objects to the appropriate input ports of the main segment. As each object is
assigned to an input port, the symbolic interpreter adds an assertion to the data-base
stating the assignment. For example, if List-1 is the input object matched to the
input port The-Current-List of application A-1 then REASON would assert

(Input A-1 The-Current-List List-1)

The expect clauses are then substituted into, replacing each of the segment's
input port names by the actual input object which is assigned to that port. A
situation is created to serve as the input situation of the main segment and the
substituted expect clauses of the main segment are assumed to hold in this input
situation.



Each input port of the main segment is connected via at least one data-flow link
to an input port of some sub-segment. Intuitively, the data-flow link transports the
object from the specified port of the main segment to that of the sub-segment.
Whenever the symbolic evaluator sees that an object is bound to a port which is
connected to the initiating side of a data-flow link, it simulates the data-flow by
assigning the same object to the port which terminates the data-flow. The
justification for the assertion stating this assignment points to the data-flow link and
the assertion stating the assignment of the object to the initiating port. For example,
if we had the following:


F-1  (Input A-1 The-Current-List List-1)

F-2  (Segment-part A-2 Sub-Seg-2)

F-3  (Data-flow (Input A-1 The-Current-List)
                (Input Sub-Seg-2 The-New-List))

then REASON would assert:

F-4  (Input A-2 The-New-List List-1)  (Dflow F-1 F-2 F-3)
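The propagation step in this example can be sketched in Python as a tiny fact base in which every derived assertion records the labels of the facts that justify it. This is a minimal sketch, not REASON's data-base: the `Fact` and `FactBase` classes, the tuple encoding of statements, and the `propagate` function are hypothetical, and the fact contents are one reading of the example above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Fact:
    label: str
    statement: Tuple
    justification: Tuple = ()  # (rule name, supporting fact labels...)


class FactBase:
    """Hypothetical store that numbers facts F-1, F-2, ... as they are added."""

    def __init__(self) -> None:
        self.facts: Dict[str, Fact] = {}

    def add(self, statement: Tuple, justification: Tuple = ()) -> Fact:
        fact = Fact(f"F-{len(self.facts) + 1}", statement, justification)
        self.facts[fact.label] = fact
        return fact

    def matching(self, kind: str) -> List[Fact]:
        return [f for f in self.facts.values() if f.statement[0] == kind]


def propagate(db: FactBase) -> List[Fact]:
    """For each data-flow link whose initiating port is bound, bind the same
    object to the terminating port and record the dependency."""
    new = []
    for flow in db.matching("Data-flow"):
        _, (_, src, src_port), (_, dst_segment, dst_port) = flow.statement
        for inp in db.matching("Input"):
            _, app, port, obj = inp.statement
            if (app, port) != (src, src_port):
                continue
            for part in db.matching("Segment-part"):
                _, dst_app, segment = part.statement
                if segment == dst_segment:
                    new.append(db.add(("Input", dst_app, dst_port, obj),
                                      ("Dflow", inp.label, part.label, flow.label)))
    return new


db = FactBase()
db.add(("Input", "A-1", "The-Current-List", "List-1"))
db.add(("Segment-part", "A-2", "Sub-Seg-2"))
db.add(("Data-flow", ("Input", "A-1", "The-Current-List"),
        ("Input", "Sub-Seg-2", "The-New-List")))
derived = propagate(db)
print(derived[0])
```

Because the justification names its supporting facts, retracting any one of them identifies exactly which derived assignments lose their support, which is the point of recording dependencies at every step.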

When all of a sub-segment's input ports have been assigned input objects the
sub-segment is ready for application; an application name is created and asserted to be the
application name of the current invocation of the sub-segment.

If the segment's spec-type is provided in the plan diagram then application
proceeds as follows. First, a situation is created to serve as the input situation of the
sub-segment, for example:


F-5  (Input-Situation A-2 Situation-2-In)

Next, since the expect clauses of the segment's specs are required to be true in this
input situation, REASON creates a goal to show that each expect clause holds. These
goals are the expect clauses with the input objects substituted for their corresponding
input names. Each such goal assertion has a dependency pointing to (i) all of the
Input assertions relevant to that clause, (ii) the spec clause from which the goal is
built, and (iii) the assertion stating the input situation of the application. If all of the
goals are satisfied the segment is applicable; otherwise the plan has an error. The
assertion that the segment is applicable is justified by a dependency which points to
the satisfied assertions for each expect clause goal. Thus, if Sub-Seg-2 has expects E-1
and E-2 and if Sub-Seg-2 has spec-type Spec3 then the following assertions are created:




F-6   (Spec-type Sub-Seg-2 Spec3)
F-7   (Spec-Clause Spec3 Expect Case-1 Clause-1
          (E-1 The-New-List))

F-8   (Spec-Clause Spec3 Expect Case-1 Clause-2
          (E-2 The-New-List))

F-9   (Goal ((E-1 List-1) Situation-2-In)  (ExpClause F-7 F-6 F-5 F-4 F-2)
          for (Expect-clause-of A-2) in ())

F-10  (Goal ((E-2 List-1) Situation-2-In)  (ExpClause F-8 F-6 F-5 F-4 F-2)
          for (Expect-clause-of A-2) in ())

Notice the use of the "for" part of the goal assertion to indicate that the higher level
task from which the goal arose is the symbolic interpreter's expect checking routine.

When these goals are satisfied we obtain the following:

F-20  (Satisfied (Goal ((E-1 List-1) Situation-2-In)
                  for (Expect-clause-of A-2) in ()))

F-21  (Satisfied (Goal ((E-2 List-1) Situation-2-In)
                  for (Expect-clause-of A-2) in ()))

F-22  (Applicable A-2)  (Expects-satisfied F-20 F-21)
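The expect-checking step can be illustrated with a short Python sketch. It assumes, purely for the example, that a goal counts as satisfied when the substituted clause is already among the known tagged assertions; REASON itself would attempt to prove each goal. The function name `check_expects` and the dictionary it returns are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Assertion = Tuple[Tuple[str, str], str]  # ((predicate object) situation)


def check_expects(expects: List[Tuple[str, str]],
                  bindings: Dict[str, str],
                  known: Set[Assertion],
                  input_situation: str) -> dict:
    """Substitute input objects for port names in each expect clause, tag the
    result with the input situation, and report which goals are supported."""
    goals = [((pred, bindings[port]), input_situation) for pred, port in expects]
    unsatisfied = [g for g in goals if g not in known]
    return {
        "goals": goals,
        "applicable": not unsatisfied,
        "support": [g for g in goals if g in known],
        "unsatisfied": unsatisfied,
    }


expects = [("E-1", "The-New-List"), ("E-2", "The-New-List")]
bindings = {"The-New-List": "List-1"}
known = {
    (("E-1", "List-1"), "Situation-2-In"),
    (("E-2", "List-1"), "Situation-2-In"),
}
report = check_expects(expects, bindings, known, "Situation-2-In")
print(report["applicable"])
```

The `support` list plays the role of the satisfied-goal assertions that justify the Applicable fact; a non-empty `unsatisfied` list corresponds to the case where the plan has an error.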

When the segment is shown to be applicable a new situation is created to serve
as the segment's output situation. If the segment's specs specify that any of the
outputs are new objects (i.e. created within this segment's execution), then the
interpreter creates object names for these outputs and assigns them to the appropriate
output ports. This is done using an Output assertion which is similar to the Input
assertion shown above. The assert clauses of the specs are instantiated, replacing each
(input or output) port name by the name of the object assigned to that port. The
appropriate situational tags are also defaulted into these assertions. These instantiated
assertions are then asserted with justifications which point to the statement that the
segment is applicable, to the spec-type assertion for this segment, to the assertion
representing the actual spec clause from which this assertion is built, and to the
relevant Input and Output assertions. Suppose that A-2 is determined to be applicable
and that it produces an output named Out-List. Suppose, further, that the assert
clauses of this segment specify that its output is to be sorted. Then, REASON would
create the new object Out-List-1 to stand for this output, asserting:




F-40  (Output-Situation A-2 Situation-2-Out)

F-41  (Output A-2 Out-List Out-List-1)

F-42  (Spec-Clause Spec3 Assert Case-1 Clause-1 (Sorted Out-List))

F-43  ((Sorted Out-List-1) Situation-2-Out)  (Output-Assert F-22 F-40 F-41 F-42 F-6 F-2)

The output ports of the sub-segment just interpreted are linked to other segments
via data-flow links. These may terminate at either input ports of other sub-segments
or at output ports of the main segment. If the data-flow link terminates at another
sub-segment's input port, the object assigned to the output port of the first segment is
then assigned to the input port of the second. This process produces assertions like
those created by the data-flows from the main segment's to the sub-segment's input ports.
For example, if the output of A-2 flows to Sub-Seg-3's Sorted-List input we would get the
following assertions:

F-100  (Data-flow (Output Sub-Seg-2 Out-List)
                  (Input Sub-Seg-3 Sorted-List))

F-101  (Segment-part A-3 Sub-Seg-3)

F-102  (Input-Situation A-3 Situation-3-In)

F-103  (Input A-3 Sorted-List Out-List-1)  (Dflow F-41 F-2 F-100 F-101)

If  the  terminating  subsegment  now  has  all  of  its  input  objects  bound,  it  is  ready  for 
application  and  we  proceed  as  above. 
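
The movement of objects along data-flow links can be pictured with a small sketch.
The Python below is only an illustration under assumed data structures (a dictionary of
port bindings and a list of links); the names `propagate`, `bindings` and
`required_inputs` are hypothetical and are not taken from REASON. It performs a single
pass over the links and reports which segments have become ready for application.

```python
def propagate(bindings, dataflow_links, required_inputs):
    """Copy objects along data-flow links; return segments with all inputs bound.

    bindings:        {(segment, port): object} for ports already assigned
    dataflow_links:  [((src_segment, src_port), (dst_segment, dst_port)), ...]
    required_inputs: {segment: [input port names]}
    """
    for src, dst in dataflow_links:
        if src in bindings:
            # The same object is simply assigned to the port at the far end.
            bindings[dst] = bindings[src]
    return [seg for seg, ports in required_inputs.items()
            if all((seg, port) in bindings for port in ports)]


# Hypothetical data mirroring the A-2 / A-3 example in the text.
bindings = {("A-2", "Out-List"): "Out-List-1"}
links = [(("A-2", "Out-List"), ("A-3", "Sorted-List"))]
required = {"A-3": ["Sorted-List"], "A-4": ["Some-Input"]}
ready = propagate(bindings, links, required)
```

Here only A-3 becomes ready, since A-4 still has an unbound input port.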

If a data-flow link leads from a subsegment output port to an output port of
the main segment, then the object assigned to the subsegment's port is transported to
the output port of the main segment. Assertions and justifications like those above are
created. If objects are assigned to all the output ports of the main segment, then
interpretation  of  the  plan  diagram  is  complete  and  an  output  situation  for  the  main 
segment  is  created.  If  the  plan  diagram  correctly  implemented  the  specs  of  the  main 
segment,  then  the  assert  clause  of  the  main  segment  should  be  provable  in  its  output 
situation.  Goal  statements  like  those  created  for  expect  clauses  of  a  subsegment  are 
created  for  the  assert  clauses  of  the  main  segment;  these  goals  are  justified  in  a 
manner  similar  to  that  above. 




So far I have assumed that the spec type of each sub-segment is known.
However, if the segment's spec-type is not known, but rather an internal plan diagram
is provided (i.e. we know its plan but not its specs), then the sub-segment is
interpreted recursively; an output situation and a set of output objects will result.
The interpretation then continues as above. If both the specs and the plan diagram
for a sub-segment are known, REASON first uses the specs in interpreting the outer
diagram and then returns to the inner diagram, symbolically interpreting it and
showing that its specs follow from its plan diagram. As we will see later, this allows
us to break the task up into smaller pieces; in the case of recursions and loops it
provides a means for stating a "subgoal invariant" [Morris & Wegbreit, 1977].

A subsegment which has cases presents additional complexity. Like other
segments, the segment with case-splits may have expect clauses which must be true for
the segment to function and asserts which are true no matter which case is taken.
These are called the case-0 clauses; if the case-0 expects cannot be proven in the
segment's input situation an error has been detected. If the case-0 expects are shown
to be satisfied, the case-0 asserts may be asserted in the output situation. However,
once the case-0 clauses have been proven it is still necessary to show that at least one
of the other cases is valid. This is done by iterating over the cases, attempting to
prove the when clauses of each. As each case is attempted, REASON sets the goal of
showing that each when clause of the case is provable. If these goals are satisfied, the
case is applicable. Before starting, however, it assumes that the case is inapplicable so
that unless a proof of applicability is found, REASON prudently assumes that there
are no applicable cases.

Each attempted proof of a when clause can lead to one of three results: a proof
of the clause could be found, a refutation of the clause could be found, or neither of
the above. In many practical cases, the system will be able to know when it has
reached a case of unprovability; typically the inputs to the main segment of a plan
diagram are not highly constrained by the expect clauses. For example, a segment
might expect a list as input and then test internally for emptiness, taking different
branches for the two possibilities. In such cases the system can determine that it
cannot know whether the input list is null or not; instead REASON will immediately
engage in a case-split analysis. This avoids the wasted effort of attempting to make
impossible proofs.



If proofs can be found for all the when clauses of a case, then the case is applicable.
REASON asserts the case to be applicable with a justification pointing to those facts
which satisfied the when goals. The assert clauses of the case are added to the output
situation and justified as above. If a when clause of a case is refuted, the case is
declared to be inapplicable with a justification pointing to the refuting fact. The next
case is then tried.

The specs used in REASON for case-splitting segments assume that the cases are
ordered sequentially, i.e. Case-2 can only be considered if Case-1 is inapplicable, and
similarly for the remaining cases. Thus, the goal of showing the applicability of Case-2
includes a subgoal that Case-1 is not applicable; Case-3 includes the two goals that
Case-1 and Case-2 are not applicable. However, REASON will not attempt a case
unless it already knows that the prior cases are not applicable (or unless they are of
unknown applicability, as I will discuss next). Therefore, REASON already knows the
results for the previous cases and includes these in the dependencies supporting the
assertion which invoked the conjunctive goal mechanism. This builds up a
justification structure which guarantees that no more than one case can have its
applicable assertion in at the same time. Thus, a segment with three cases would
have justifications like the following (note that wavy lines indicate non-monotonic
dependency).


[Figure: Support Structure For Case-Splitting Segment]

Notice that this is an "and-or" graph. Individual justifications include
dependence on several facts at once (conjunction is indicated by lines joining together
at an arrow) while several justifications may independently support the same fact
(disjunction is indicated by separate arrows pointing at the same fact). Also notice
that the dependencies guarantee that at most one case will be considered applicable at
a time.

However, the normal circumstance is that each case (except the last) will have
some when clause which can be neither proved nor refuted; we then say that the
clause is of unknown truth value. A case which contains no refuted clauses and at
least one clause of unknown truth value is said to be of unknown applicability. These
cases reflect the possibility that a test might sometimes succeed and sometimes fail,
depending on the input data. Since we are interpreting the plan on symbolic (typical)
inputs, most segments with case structure will have cases of unknown applicability.
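
The three possible outcomes for a when clause, and the resulting status of a case, can
be summarized in a few lines. The following Python is a minimal sketch, not REASON's
implementation; the names `Proof` and `classify_case` and the result strings are
assumptions introduced only for illustration.

```python
from enum import Enum


class Proof(Enum):
    PROVED = "proved"
    REFUTED = "refuted"
    UNKNOWN = "unknown"   # neither proved nor refuted


def classify_case(when_results):
    """Classify one case from the proof results of its when clauses."""
    results = list(when_results)
    if any(r is Proof.REFUTED for r in results):
        return "inapplicable"            # a single refuted clause rules the case out
    if any(r is Proof.UNKNOWN for r in results):
        return "unknown applicability"   # no refutation, at least one open clause
    return "applicable"                  # every when clause was proved
```

A refutation takes priority over an unknown clause, matching the definition above: a
case of unknown applicability contains no refuted clauses.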




If a case has unknown applicability then REASON first considers the possibility
that the case is applicable, assuming that all its clauses of unknown truth value are
true. These are justified by a non-monotonic dependency structure of some
complexity, the purpose of which is to allow the system to select between the cases
later. For the moment I will ignore all aspects of the justifications which are not
related to the case-splitting. Suppose that Case-1 of application A3 has a when clause
requiring P, and further suppose that P is not provable. We want to set up
justifications so that we may easily return later to the assumption that Case-1 is
applicable, bringing in the assumption that P holds. Also we want to be able to
switch easily to the assumption that some case other than Case-1 holds, bringing in the
assumption (not P). Therefore P is made to depend non-monotonically on (not P).
Also REASON creates an assertion stating that it should consider Case-1 and it makes
P depend monotonically on this assertion.


F-110 (not P)                ; Note lack of
F-111 (Select Case-1 A3)     ; justification
F-112 P                      (Case-Split-Assumption (F-111) (F-110))

However, this is not all that must be done. If Case-1 is not to be considered (as
would happen when we go on to look at other cases) we would want (not P) to be
brought in. Actually, if there is more than one unknown clause then we want to
bring in the disjunction of all such clauses. Assume there is a second when clause Q in
Case-1. Then the justification structure would look like:


F-110 (not P)                  ; no justification
F-111 (Select Case-1 A3)       ; no justification
F-112 P                        (Case-Split-Assumption (F-111) (F-110))
F-113 (not Q)                  ; no justification
F-114 Q                        (Case-Split-Assumption (F-111) (F-113))
F-115 (Or (not P) (not Q))     (Case-Split-Assumption (F-116) (F-111))
F-116 (Select Case-2 A3)       ; no justification


If there is a third case, then F-115 should also be brought in whenever this case is
being considered. REASON adds another justification to F-115 as follows:


F-117 (Select Case-3 A3)       ; no justification

F-115 (Or (not P) (not Q))     (Case-Split-Assumption (F-117) (F-111))




REASON may now consider any case simply by giving the appropriate
(Select Case-i A3) assertion a justification. When it is through considering the case, it
must out the select assertion by retracting the justification supporting it. Notice
that if no select assertion is justified, then
none of the assumptions are in, representing the most general case where we have no
idea which case holds.
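
The effect of justifying one select assertion at a time can be illustrated with a toy
labelling procedure. The Python below is a rough sketch and not the Truth Maintenance
System used by REASON: it recomputes in/out status by naive iteration, which suffices
for this small example but is not a general algorithm for non-monotonic networks. The
function `label` and the table `CASE_SPLIT` are hypothetical names; each justification
is written as a pair (in-list, out-list), as in the Case-Split-Assumption entries above.

```python
def label(justifications, premises):
    """Return the set of assertions that are 'in'.

    justifications: {assertion: [(in_list, out_list), ...]}
    premises:       assertions supported directly (here, a select assertion)
    An assertion is 'in' if it is a premise, or if some justification has every
    member of its in_list in and every member of its out_list out.
    """
    status = set(premises)
    for _ in range(len(justifications) + 1):
        new = set(premises)
        for fact, justs in justifications.items():
            for in_list, out_list in justs:
                if (all(f in status for f in in_list)
                        and not any(f in status for f in out_list)):
                    new.add(fact)
        if new == status:
            return status
        status = new
    raise RuntimeError("labelling did not stabilise")


# Hypothetical encoding of the case-split assumptions for P and Q.
CASE_SPLIT = {
    "P": [(["select-1"], ["not-P"])],
    "Q": [(["select-1"], ["not-Q"])],
    "or-not-P-not-Q": [(["select-2"], ["select-1"]),
                       (["select-3"], ["select-1"])],
}
```

With `select-1` as the only premise, P and Q come in; with `select-2` instead, only the
disjunction comes in; with no premise at all, none of the assumptions are in.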

Once REASON has assumed all clauses of unknown truth value for a particular
case, it will have satisfied all the case's when clauses. The case will then be declared
applicable and a justification created pointing to all assertions satisfying any of the
when clauses, including the assumptions justified by the select assertion. The output
clauses of the case are then asserted in the output situation, each being justified by
the assertion declaring the case applicable. Thus, the logical relationship between the
assumptions and the output assertions is represented explicitly in the data-base.
Finally, if there are conditional-control-flow links coming from the current case, these
are declared active with a justification pointing to the assertion which declares the
case applicable. REASON now continues evaluating any segments which terminate the
conditional-control-flow links leaving the segment until it reaches either a JOIN or the
output side of the main segment of the plan diagram.

As REASON goes along the paths started by the conditional-control-flow links, it
records which cases have yet to be evaluated. Thus, when a terminal segment is
reached it returns to evaluate the remaining cases.

It begins by outing the select assertion for the last case evaluated, removing the
assertion's justification. It then justifies the select assertion for the next case. This
has the effect of assuming the falseness of at least one of those clauses from the
previous case which had unknown truth value. The inapplicability of the previous case
follows from this assumption directly (using proof by cases if there is more than one
clause of unknown truth value); REASON constructs this proof, recording the
appropriate justifications. Thus, selecting Case-2 will out the applicable assertion of
Case-1.

The next case is then evaluated with its select assertion in, i.e. REASON
investigates whether the next case's applicability follows from the inapplicability of the
previous case. If the new case has clauses of unknown truth value, then REASON
proceeds as above, creating non-monotonic justifications for the current case's
applicability. If all the clauses of the new case can be proven, then this case is




declared  applicable.  If  the  case  has  a  clause  which  is  definitely  false,  the  case  is 
declared  inapplicable;  REASON  then  moves  on  to  the  next  case. 

If the last case has clauses of unknown truth value, then the segment is declared
inapplicable with a justification pointing at a statement expressing the possibility that
the unproven clauses might be false. This statement is justified so that it depends on
the outness of the clauses of unknown truth value; if something is changed to make
these clauses definitely true, the inapplicable assertion will go out. If the last case has
clauses of unknown truth value, then there is an error which manifests itself as an
intermittent program bug; if the input data happened to be such that an earlier case
would succeed then the program would work, otherwise it would fail. Such
intermittent bugs are among the most distressing problems of programming and it is
desirable to be able to spot them through the process of symbolic evaluation.

Since REASON requires both that at least one case be applicable and that Case-0
be applicable in all circumstances, the final action taken in evaluating a case-splitting
segment is to assert the disjunction of all the select assertions and to attempt to prove
the applicable assertion for the segment. This is always done by case-splitting the
disjunction of the select assertions. Thus if there were three cases, we would have:

F-200 (Or (Select Case-1 A3) (Select Case-2 A3) (Select Case-3 A3))

F-201 (Subgoal (And (Or (Applicable Case-1 A3) (Applicable Case-2 A3) (Applicable Case-3 A3))
                    (Applicable Case-0 A3))
               for ((Applicable A3) ...)
               in (...))

F-202 (Show (goal (Or (Applicable ...)))
            by (splitting (Or (Select ...)))
            for ((Applicable A3) ...) in (...))

This proof proceeds trivially; all the justifications have already been built up. As each
select statement is assumed by the case-splitting mechanism, the corresponding case
becomes applicable and the disjunction in F-201 is deduced by disjunction introduction.
If the last case had not been found applicable, however, then the appropriate clause
will not come in and the goal will not be deducible; REASON then complains that it has
found a bug. In any event the final justification structure built up in this process
looks as follows:



When a case is declared applicable its output objects are propagated along
control flow links just as for non case-splitting segments. In addition, if there is a
conditional-control-flow link originating at a case then it is active whenever the case
with which it is associated is applicable. A segment which terminates a
conditional-control-flow link is ready for application only if all its incoming
control-flow links are active.

The only primitive of plan diagrams not yet discussed is the join. When the
control-flow link and all inputs leading to a case of a join are available, REASON
creates an anonymous object to stand for the output objects of the join. It then
creates assertions pairing the inputs to the newly created outputs. The pairings are a
set of ID assertions with the input object and the corresponding output object as
arguments. Each ID assertion is justified by a select assertion stating that the current
case of the join is active. This, in turn, is justified by the assertion stating that the
incoming control-flow link is active. Thus, a specific pairing of the join's output to
an input object can only be made if one of the incoming control-flow links is active.

However, a control flow link terminating at a join must trace back to a case of
a segment at which control was split. The join case can be active only if the
appropriate case of the segment at which control split is applicable. When REASON
first evaluates a case of a join it examines which segment select statements are in; it
then justifies these by a pointer to the case select statement for the join case.
Selecting a join case then brings in all the assumptions relevant to the particular
control path which terminates at that case of the join. The select statements for the
join are quite similar to those for case-splitting segments and may be used in proofs
by cases to prove various properties of the output objects of a join. This is useful
since it makes it possible to state properties of the output object without making a
commitment to which case is active.

When control reaches a join, REASON does not continue interpreting past the
join. Instead it returns to any case-splitting segment whose analysis has not yet been
completed. This will make other control paths active, activating other incoming cases
of the join. Only when all incoming cases of a join have been activated will
REASON pursue the paths leading away from the join. Before doing so, however, it
makes sure that it has cleaned up the evaluation of all prior case-splits so that
evaluation of the paths leading out from the join does not inadvertently proceed under
the assumption that only a single case of the case-splitting segment need be
considered.




When all interpretation is completed REASON attempts to prove that the assert
clauses of the main segment are satisfied. Again each clause is translated into a goal
and the reasoning mechanisms of the previous chapter are invoked. If the proofs
succeed, justifications are built as before. Thus, when all the goals are proved a
complete dependency network is built, linking every satisfied goal back to the
primitives of the plan diagram upon which the goal depends. These dependencies
point to data-flow and control-flow links, to input assertions of the main segment and
to output assertions of the sub-segments.

Each such proof of a goal can be categorized as either a pre-requisite proof or
an achieve proof. Pre-requisite proofs are those which establish that a sub-segment's
expect and when clauses are satisfied. Achieve proofs are those proving the assert
clauses of the main segment. If these are summarized to remove the detail, leaving
only the connections between spec clauses and flow links, then we have what we have
called purpose links.

We see, therefore, that a symbolic interpretation in REASON leads to
considerably more information than just the statement that the program does what is
intended. In addition to this data, REASON produces a complete proof and a
summary of this proof into purpose links which quickly indicate the intermodule
dependencies in the program. Furthermore, this data is so organized that if a crucial
spec clause is changed then all other sub-segments which depended on this clause will
be declared inapplicable. This change of spec status will be signalled by the Truth
Maintenance System as part of its normal inning and outing of facts. REASON
responds to these notices and informs the user of the exact nature of the problem
caused by the change. Purpose links provide a rapid mechanism whereby REASON
can tell without a deeper analysis that a proposed change is not safe. Since the
purpose links tell whether a spec clause is used in any proof, all REASON must do is
to see if the clause is involved in any purpose links. If so, the link tells which
segments are affected by the change.
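
The quick safety check described here amounts to a lookup over the purpose links. The
sketch below assumes, purely for illustration, that purpose links have been flattened
into (spec clause, dependent segment) pairs; the function `affected_segments`, the
clause names and the sample links are hypothetical and do not come from REASON.

```python
def affected_segments(changed_clause, purpose_links):
    """Return the segments whose proofs use `changed_clause`.

    purpose_links: iterable of (spec_clause, dependent_segment) pairs, an assumed
    flattened form of the summarized pre-requisite and achieve proofs.
    """
    return sorted({segment for clause, segment in purpose_links
                   if clause == changed_clause})


# Hypothetical purpose links for a small plan diagram.
links = [
    ("Create-Nil.assert-1", "FI1"),
    ("FI1.assert-1", "Reverse"),
    ("FI1.assert-2", "Fast-Intersect"),
]
```

An empty result means the clause takes part in no recorded proof, so under this model
changing it would not disturb any segment.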




Chapter 7: An Example of Symbolic Interpretation

To show how the features developed so far interact in the analysis of a
moderately complicated algorithm, let us consider how REASON interprets a routine
for computing the intersection of two sets represented as ordered lists. This algorithm
runs in linear time by only considering the heads of both lists. If the two heads are
identical, then that element should be added to the accumulation. If they are not
identical, then the smaller element cannot also be a member of the other list. Thus,
it can be thrown away and the iteration continued. One possible coding of this
routine is as follows:

(Defun fast-intersect (list-1 list-2)
  (do ((Acc nil)
       (Car-L1 nil)
       (Car-L2 nil)
       (Uid-1 nil)
       (Uid-2 nil))
      ((Or (Null list-1) (Null list-2))
       (reverse Acc))
    (Setq Car-L1 (car list-1) Car-L2 (car list-2))
    (Setq Uid-1 (maknum Car-L1) Uid-2 (maknum Car-L2))
    (Cond
      ((eq Uid-1 Uid-2) (Setq Acc (cons Car-L1 Acc))
                        (Setq list-1 (cdr list-1)
                              list-2 (cdr list-2)))
      ((< Uid-1 Uid-2) (Setq list-1 (cdr list-1)))
      (t (Setq list-2 (cdr list-2))))))
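
For readers less familiar with Lisp, the same head-comparison strategy can be sketched
in Python. This is an illustrative transcription, not part of the original program: it
compares the elements directly instead of comparing the identifiers the Lisp obtains
for them, and the name `fast_intersect` and the use of `deque` are choices made here.

```python
from collections import deque


def fast_intersect(list_1, list_2):
    """Intersect two ascending lists in linear time by comparing their heads."""
    acc = deque()          # built newest-first, like Acc in the Lisp version
    i = j = 0
    while i < len(list_1) and j < len(list_2):
        head_1, head_2 = list_1[i], list_2[j]
        if head_1 == head_2:
            acc.appendleft(head_1)   # plays the role of (cons head Acc)
            i += 1
            j += 1
        elif head_1 < head_2:
            i += 1                   # smaller head cannot occur in the other list
        else:
            j += 1
    return list(reversed(acc))       # restore ascending order


common = fast_intersect([1, 3, 5, 7], [3, 4, 5, 8])
```

The accumulator is deliberately kept in reverse order and reversed at the end, since
that is the structure the specs in this chapter describe.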


As we mentioned before, the LISP code is first analyzed by a surface flow analyzer
which abstracts out many of the details of surface data and control-flow. In
particular, SETQs used to achieve data-flow are translated into data-flow links and
CONDs and other control primitives are translated into case structured segments with
conditional control flow links. The plan diagram given to REASON for analysis is the
following:


[Figure: plan diagram for Fast-Intersect]

The  specs  for  this  routine  are  as  follows: 


(Defspecs Fast-Intersect
  (Inputs list-1 list-2)
  (Expect (Object-type list-1 Sorted-list)
          (Object-type list-2 Sorted-list))
  (Outputs list-3)
  (Assert
    (Object-type list-3 Sorted-list)
    (For-all (:e1)
      (And (Member list-1 :e1) (Member list-2 :e1))
      (Member list-3 :e1))
    (For-all (:e1)
      (Member list-3 :e1)
      (And (Member list-1 :e1) (Member list-2 :e1)))))

These specs refer to descriptions of data objects which I have not yet presented.
These details are developed more extensively in a later chapter. However, all the
notions used here are intuitive. The predicate Sorted-list means that the list is sorted
in increasing order by Unique-Id (the maknum function of MacLisp). The
quantified statements in the specs say that all elements of the intersection are in the
output list and that only these elements are in the output list.

Actually, the inner routine FI1 does all the work of the program, and is called
recursively. This means that we have to give a specification for this inner routine. In
the next chapter I will develop a method which will remove this need by allowing
REASON to recognize parts of a program as instances of standard plans whose
specifications we already know. The specs for FI1 are:




(Defspecs FI1
  (Inputs list-1 list-2 CA)
  (Expect (Object-type list-1 Sorted-list)
          (Object-type list-2 Sorted-list)
          (Object-type CA Reverse-Sorted-list)
          (For-all (:x)
            (Or (Member list-1 :x) (Member list-2 :x))
            (For-all (:y)
              (Member CA :y)
              (< [Unique-Id :y] [Unique-Id :x]))))
  (Outputs Final-Accum)
  (Assert
    (Object-type Final-Accum Reverse-Sorted-list)
    (For-all (:x)
      (And (Member list-1 :x) (Member list-2 :x))
      (Member Final-Accum :x))
    (For-all (:x) (Member CA :x)
      (Member Final-Accum :x))
    (For-all (:x) (Member Final-Accum :x)
      (Or (And (Member list-1 :x)
               (Member list-2 :x))
          (Member CA :x)))))

Given these specs for FI1, it is an immediate consequence that Fast-Intersect satisfies its
specs. Fast-Intersect calls FI1 with its two input lists as the two lists, and with nil as
the CA input. Since nothing is a member of nil, the second quantified statement in
the asserts of FI1 is vacuous; similarly the third quantified statement contains a
disjunction whose second disjunct is vacuous if the CA input is nil; the other disjunct is
exactly that required by Fast-Intersect. Similarly, the expect clauses are met simply;
since the input CA is nil, it is a vacuous condition that all elements of list-1 and list-2
have larger uid's than the elements of CA. Finally, FI1 produces a list in reverse sorted
order which is then reversed by Fast-Intersect, producing the required sorted list as
output.
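
The vacuity argument can be checked on concrete data. The following Python is a small
sketch under stated assumptions: it models only the three quantified assert clauses of
FI1 and the two quantified assert clauses of Fast-Intersect (the object-type clauses
about sortedness are left out), and the function names are hypothetical.

```python
def fi1_asserts_hold(list_1, list_2, ca, final_accum):
    """Check the three quantified assert clauses of the FI1 specs."""
    both = set(list_1) & set(list_2)
    in_both_implies_out = all(x in final_accum for x in both)
    in_ca_implies_out = all(x in final_accum for x in ca)   # vacuous when ca is empty
    out_implies_source = all((x in both) or (x in ca) for x in final_accum)
    return in_both_implies_out and in_ca_implies_out and out_implies_source


def fast_intersect_asserts_hold(list_1, list_2, output):
    """Check the two quantified assert clauses of the Fast-Intersect specs."""
    both = set(list_1) & set(list_2)
    return all(x in output for x in both) and all(x in both for x in output)


# With CA empty, the two sets of clauses agree on every candidate output tried here.
l1, l2 = [1, 3, 5], [3, 5, 9]
candidates = [[5, 3], [3], [5, 3, 9], []]
agreement = [fi1_asserts_hold(l1, l2, [], out) == fast_intersect_asserts_hold(l1, l2, out)
             for out in candidates]
```

Agreement on a handful of candidates is of course only a spot check, not the proof
that REASON constructs.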


I will now describe the actions which the symbolic interpreter takes in evaluating
the above plan diagram. However, going through all of the details would be an overly
cumbersome exercise, so I will try to present this without too much tedium and
repetition. The system begins by creating an input situation and anonymous objects
to stand for the inputs to the program. We will call these S-In, list-1 and list-2
respectively. The system then asserts the expect clauses of Fast-Intersect's specs. This
gives us:




F-1 ((Object-type list-1 Sorted-list) S-In)

F-2 ((Object-type list-2 Sorted-list) S-In)

Next the system evaluates the segment Create-Nil which simply asserts that its output is
nil. REASON names the output situation of Create-Nil Create-Nil-Out; thus we have:

F-4 (Output Create-Nil The-Null-Object Nil)

F-5 ((Object-type Nil Empty-list) Create-Nil-Out)

The data objects are next moved along the data-flow links to the input ports of FI1,
the routine which actually does the work. Since there are no side-effects in this
program, all assertions which are true in one situation will be true in all succeeding
situations. (Side effects change this drastically; I will discuss the problems of
side-effects in greater detail later.) Therefore the following facts are true in the input
situation of FI1, which REASON names FI1-In:

F-6  (Input FI1 L1 list-1)

F-7  (Input FI1 L2 list-2)

F-8  (Input FI1 CA Nil)

F-9  ((Object-type list-1 Sorted-list) FI1-In)

F-10 ((Object-type list-2 Sorted-list) FI1-In)

F-11 ((Object-type Nil Empty-list) FI1-In)

F-12 ((Object-type Nil List) FI1-In) (Nil-is-list F-11)

As mentioned above, REASON has the specs for FI1 already. So if it can show
that the expects of FI1 are satisfied, it can use the asserts directly. These expects are,
however, direct conclusions. REASON declares this invocation of FI1 applicable,
creates an output situation FI1-Out, and adds the asserts to this situation, getting:




F-15 (Output FI1 Final-Accum Final-Accum-8)

F-16 ((Object-type Final-Accum-8 Reverse-Sorted-list) FI1-Out)

F-17 (For-all (:x)
       ((Member Nil :x) FI1-In)
       ((Member Final-Accum-8 :x) FI1-Out))

F-18 (For-all (:x)
       (And ((Member list-1 :x) FI1-In)
            ((Member list-2 :x) FI1-In))
       ((Member Final-Accum-8 :x) FI1-Out))

F-19 (For-all (:x)
       ((Member Final-Accum-8 :x) FI1-Out)
       (Or (And ((Member list-1 :x) FI1-In)
                ((Member list-2 :x) FI1-In))
           ((Member Nil :x) FI1-In)))

The outputs now flow to the reverse segment whose only effect is to change the
object-type statement above, producing a sorted list instead of a reverse sorted list.
REASON can then immediately show that the desired results hold in the output
situation of Fast-Intersect.

However, to use the specs of the internal routine FI1 with confidence, REASON
must demonstrate that its specs follow from its plan diagram; therefore, it creates
anonymous inputs for FI1 and begins to symbolically evaluate the plan diagram for FI1.
REASON names the two lists input to FI1 list-1 and list-2 (as above) and the current
accumulation CA. The expect clauses of FI1 are asserted in the input situation of the
current application of FI1:

F-20 ((Object-type list-1 Sorted-list) FI1-In)

F-21 ((Object-type list-2 Sorted-list) FI1-In)

F-22 ((Object-type CA Reverse-Sorted-list) FI1-In)

F-23 (For-all (:x)
       (Or ((Member list-1 :x) FI1-In)
           ((Member list-2 :x) FI1-In))
       (For-all (:y)
         ((Member CA :y) FI1-In)
         ((< (Unique-Id :y) (Unique-Id :x)) FI1-In)))

Notice that situation tags have been added to the quantified statements in the spec
clauses using the simple defaulting rule that clauses mentioning output objects are
assigned to the output situation of the segment. REASON draws a few direct
conclusions from the above assertions:



F-25 ((Object-type List-1 List) FI1-In)    (Type-inherit F-20)

F-26 ((Object-type List-2 List) FI1-In)    (Type-inherit F-21)

F-27 ((Object-type CA List) FI1-In)        (Type-inherit F-22)

REASON now begins the symbolic evaluation of FI1. The data-flows lead to the
two tests which must be evaluated immediately upon entrance to FI1. The first of
these segments tests whether LIST-1 is null. REASON concludes that there is no
relevant information in the situation FI1-IN. It creates a case-split, assuming in one
case that the list is null and in the other that it is non-null. In the non-null case it
must evaluate the second test segment where a similar decision is made. REASON
now has three conditional-control-flows waiting for further evaluation. The first of
these represents the case where LIST-1 is null. The second represents the case where
neither LIST-1 nor LIST-2 is null. The final case is where LIST-2 is null but LIST-1 is
not. However, the first and the third cases both lead to JOIN-1. Both cases of JOIN-1
take the same input, CURRENT-ACCUMULATION. Since this input is available the join can be
evaluated immediately. The only action following from the join JOIN-1 is a second join,
JOIN-3. This join, however, cannot be evaluated yet since it has another input which
is not available. REASON, therefore, returns to the top of the diagram, considering
the case where both lists are non-null. This case leads to the sub-segment labeled
REAL-WORK in the diagram.

REAL-WORK is invoked after both tests have taken the non-null branch. REASON
brings in the assumptions of this case, making the following assertions active:

F-30 ((Not (Object-type List-1 Empty-list)) REAL-WORK-IN)

F-31 ((Not (Object-type List-2 Empty-list)) REAL-WORK-IN)

There are now four segments which may be evaluated immediately: CAR-1, MAKNUM-1,
CAR-2 and MAKNUM-2. The two CAR segments create output objects representing the first
objects of each list, while the two MAKNUM segments create objects representing the
numbers which are the UNIQUE-IDs of the first elements of the two lists. This leads to
the following assertions:

F-41 ((First List-1 First-1) CAR-1-OUT)             ; justifications here indicate

F-42 ((First List-2 First-2) CAR-2-OUT)             ; the fact names of the correct

F-43 ((Unique-Id First-1 Number-1) MAKNUM-1-OUT)    ; spec clauses from

F-44 ((Unique-Id First-2 Number-2) MAKNUM-2-OUT)    ; each segment's specs.




I should explain that my notation in the plan diagram has been somewhat sloppy.
CAR-1 and CAR-2 are both segments of spec-type CAR; it is the specs for this spec-type
which REASON uses, and similarly for MAKNUM-1 and MAKNUM-2. I should also note at
this point that the system makes use of two properties of UNIQUE-IDs which I have not
yet stated. First, UNIQUE-IDs are a one-to-one mapping, so that if the UNIQUE-ID of one
object is equal to the UNIQUE-ID of a second object, then the two objects are identical.
Secondly, since UIDs are numbers, any two UIDs are either equal or one of the two is
greater than the other.
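
As a small illustration of these two properties, the following Python sketch uses the built-in id function as a stand-in for the unique-id of an object. This is an assumption made only for illustration (the report's programs operate on LISP objects), and the helper names identical and compare_uids are hypothetical.

```python
def identical(a, b):
    """One-to-one property: equal unique ids imply the very same object.

    Assumption: Python's id() plays the role of the unique-id; it is
    distinct for any two objects that are alive at the same time.
    """
    return id(a) == id(b)


def compare_uids(uid_1, uid_2):
    """Trichotomy: for two numbers exactly one of three cases holds."""
    if uid_1 == uid_2:
        return "equal"
    if uid_1 < uid_2:
        return "first-smaller"
    return "second-smaller"
```

The three return values of compare_uids correspond to the three cases a test on two unique-ids has to distinguish when nothing else is known about the numbers.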

Following the evaluation of the MAKNUM segments, the only segment ready for
evaluation is the test segment labeled =<>? which takes as inputs NUMBER-1 and NUMBER-2,
anonymous objects representing the UIDs of the first elements of the two lists. The
test has three cases, corresponding to the possibility that the two numbers are equal,
that the first is smaller, or that the second is smaller. REASON decides that there is
no evidence available to decide this question and, therefore, creates a case-split.

REASON considers the first case first, getting the following justification structure:

f  98  ((*01  (lqu«1  71)  «<>?  Out)  .  not*  no 

F-99  (S«)«ct  Cot*  I  •<>>)  ,  juillllctlltti 

t  it$  (ffquol  Bi**Dtr  ■  I  Huntser.?)  -JO-Out)  (CilB-tpl  (f-99)  (F-91)) 

This triggers REASON to conclude that FIRST-1 and FIRST-2 are identical:

r  -  111  (Id  Mrit-I  liritl)  (On*  to  On*  fill) 

The conditional control flow link leading from the test =<>? leads to the segment
labeled RW-1. The data flows take the object FIRST-1 to the CONS segment as one input;
the object CA is the other input. LIST-1 and LIST-2 flow to the two CDR segments. The
specs of CONS say that it produces a new CONS-CELL whose left is the object FIRST-1 and
whose right is the object bound to CA, which is known to be a list. Rules in the
system which represent the definition of list membership make several inferences from
these two assertions:



F-199 (Output Cons-1 The-new-cons C-1)             ; justifications pointing

F-200 ((Left C-1 First-1) CONS-1-OUT)              ; to the spec clauses

F-201 ((Right C-1 CA) CONS-1-OUT)                  ; would go here - they are

F-202 ((Object-type C-1 Cons-cell) CONS-1-OUT)     ; omitted for simplicity

F-203 ((Object-type C-1 List) CONS-1-OUT)
F-204 ((First C-1 First-1) CONS-1-OUT)
F-205 ((Rest C-1 CA) CONS-1-OUT)

F-206 ((Member C-1 First-1) CONS-1-OUT)
F-207 (For-all (:x)
        ((Member CA :x) CONS-1-IN)
        ((Member C-1 :x) CONS-1-OUT))

The  origin  of  the  rules  which  make  such  inferences  will  be  explained  in  more  detail  in 
the  chapter  on  describing  data  objects. 

The two CDR segments produce the obvious output assertions:

f  292  ((Brit  I  rot  - 1  Boit-1)  COB  1  OUT) 
f  299  ((Brit  Hit  2  Brit  - 1)  COB  1  OUT) 

These objects now flow to the recursive call to FI1. So far I have not mentioned any
checking of input expectations since these have all been trivial. FI1, however, requires
that its two input lists be sorted, and that its accumulation input be sorted in reverse
order. The first two requirements are met simply; since REST-1 and REST-2 are the
CDRs of sorted lists they themselves are sorted. The condition that C-1 be sorted in
reverse order is also met quite simply. Its CDR, CA, is a reverse-sorted list and input
expectations stated that FIRST-1 has a larger MAKNUM than any member of CA. Thus, FI1
is applicable and its output assertions can be added to the data base. This creates a
new output FINAL-ACCUM-1 and three quantified statements:


lilt-r.p  f-ZH  f-r*l  f-M«  f.*«) 
llll-np  l-NI  f-*M) 

Hit-ftp  f-/#J  f-Hl) 
iiii-*m  f  in  t-n *) 
lilt-ow  f -Ml  f-MS) 




F-i*9  ((Objfct-typf  hn»l  *tci»-l  Rtvtrtt  SoMtdliit)  Ml-l-Out) 

F -11#  (  «) 

( ( Hembtr  C-l  »)  Mi  l  l*) 

((Htmbfr  F  inti  -  Accu*- 1  »)  Mil  Out)) 

F -111  (For.fll  (  .) 

(And  ( (Memptr  ■)  Ml-l-l*) 

((H.mb.r  Int  i  ■)  f  11-1-  IN)) 

((M**b«r  I  m»1  -AcCUM- 1  «)  M1-1-0UT)) 

t-ui  (ror.«u  <  •) 

((Htwbtr  Fin«t-Accu»-l  «)  M1-1-0UT) 

(Or  (And  ( ( Mfmbtr  lfit-1  >)  Ml-l-l*) 

|(Ar»Mf  Aiat-1  >)  Mi  l  l*)) 

((K*«b»r  (  I  i)  Mi  l  l*))) 

This output now flows to the join JOIN-2 which has other unavailable inputs.
REASON, therefore, returns to the next case of the test =<>?. In this case, it assumes
the negation of the unprovable clause from CASE-1 and then attempts to prove that the
when clause of CASE-2 holds. Thus, REASON assumes that the UIDs of the two objects
FIRST-1 and FIRST-2 are distinct. Since the UID is a one-to-one property this indicates
that FIRST-1 and FIRST-2 are distinct. Furthermore, since the UID is a number and since
REASON knows that these two numbers are distinct, it asserts that one of the
numbers must be larger than the other. The following assertions result:


F-199  ( St  1 f <  l  C*if  - 1  *<>M 

F-9S  ((*ol  (Iqufl  Nunbfrl  Nts«t>tr  1))  •Oil*)  (Cttt- Ipl 1 t - tlltftf t ItA  (F-Z99)  ( F  •  •• ) ) 

F-J9I  (Not  (Id  Firtl  1  MpiI/1)  (UIO  »ot  •  •  IN  t-lt  F-«J) 

F-J91  (Or  ((<  Nuabtr-l  linbir  1)  •<>?-!*)  (NufArop  f-99) 

((<  *i»»b»r  l  Nukbfr-1)  •Oi  l*)) 

REASON now attempts to prove the when clause of CASE-2; however, this too can be
seen to be unprovable. It then sets up the next stage of the case-split:

r-MJ  (Stltcl  Cut  I  •<>>) 

F-J#«  ((Nat  (<  Ainbtr-l  Iwtir  1))  •<>»-!*)  (C  tit  -  Spill- At  u»>pt  ipn  (F-J9J)  (F-99  F-J99)) 

f  Ml  ((<  *u«Mr|  *uab*r.|)  *OM»)  (ClltSpI  It-Aiiiwptlpn  (F-299)(F-M4  F -  99 ) ) 

F-99  ( ( *ot  ((autl  *it*btr  1  «u-6tr.;))  .<>M»)  (Cflf  lpMt-tnwwtlpn  (F-MJ)(F-99  F-»H)) 



Notice that the justifications are so set up that (1) if either CASE-2 or CASE-3 is selected
the assertion F-98 will be in; (2) if CASE-2 is selected F-305 will be in, unless there is
some reason found to believe its negation; (3) if CASE-3 is selected F-304 and F-98 will be
in.
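
The way justifications of this kind bring assertions in and out can be sketched in Python. This is a deliberately naive model and not REASON's implementation: a fact is taken to be in if it is a premise, or if some justification has every fact of its in-list in and every fact of its out-list out. The class Network, its methods, and the fact names are all hypothetical, and the justifications are simplified from those shown above.

```python
class Network:
    """Naive dependency network of (in-list, out-list) justifications."""

    def __init__(self):
        self.premises = set()
        self.justifications = {}  # fact -> list of (in_list, out_list)

    def justify(self, fact, in_list=(), out_list=()):
        entry = (tuple(in_list), tuple(out_list))
        self.justifications.setdefault(fact, []).append(entry)

    def is_in(self, fact, visiting=frozenset()):
        if fact in self.premises:
            return True
        if fact in visiting:  # circular support counts as no support
            return False
        visiting = visiting | {fact}
        for in_list, out_list in self.justifications.get(fact, []):
            if (all(self.is_in(f, visiting) for f in in_list)
                    and not any(self.is_in(f, visiting) for f in out_list)):
                return True
        return False


def build_case_split():
    net = Network()
    # (1) "not equal" is supported by selecting either case 2 or case 3.
    net.justify("not-equal", in_list=["select-case-2"])
    net.justify("not-equal", in_list=["select-case-3"])
    # (2) "n1 < n2" holds under case 2 unless its negation is believed.
    net.justify("n1<n2", in_list=["select-case-2"], out_list=["not n1<n2"])
    # (3) Selecting case 3 supports the negation.
    net.justify("not n1<n2", in_list=["select-case-3"])
    return net
```

Switching the single premise from select-case-2 to select-case-3 changes which assertions are in without deleting anything from the network, which is the behavior the case-split relies on.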


The assertion F-305 makes CASE-2 applicable. REASON asserts that CASE-2 is
applicable and, since there are no output assertions to add, it follows the
conditional-control-flow link from CASE-2 to the segment RW-2, which takes the CDR of its
input LIST-1 and then calls FI1 recursively. Notice that the prerequisite conditions of FI1
are met trivially in this case. The CDR of LIST-1 must be a sorted list as noted earlier; the
second input is LIST-2 which is known to be sorted; finally, the CURRENT-ACCUMULATION input
is CA which was known to be sorted in reverse order. Thus, FI1 is applicable within
this sub-plan as well. The output assertions of this application of FI1, which REASON
names FI1-2, are similar to those above:

F-310 ((Object-type Final-Accum-2 Reverse-Sorted-List) FI1-2-Out)
F-311 (For-all (:x)
        ((Member CA :x) FI1-2-In)
        ((Member Final-Accum-2 :x) FI1-2-Out))
F-312 (For-all (:x)
        (And ((Member Rest-1 :x) FI1-2-In)
             ((Member List-2 :x) FI1-2-In))
        ((Member Final-Accum-2 :x) FI1-2-Out))
F-313 (For-all (:x)
        ((Member Final-Accum-2 :x) FI1-2-Out)
        (Or (And ((Member Rest-1 :x) FI1-2-In)
                 ((Member List-2 :x) FI1-2-In))
            ((Member CA :x) FI1-2-In)))

The justification of these assertions, which I have omitted for brevity, points back
to the relevant spec clause, output object, and applicability assertions. The output of
this segment leads to the join JOIN-2, which is waiting for another case's input.
REASON now turns to the final case of =<>?. As shown above, REASON assumes in
this case that it is false that NUMBER-1 is smaller than NUMBER-2. It then concludes by
disjunction elimination that NUMBER-2 is smaller than NUMBER-1:

*314  ((<  »un*>»r  l  NdMtfr-Z)  <»-0Ut)  (DUJ-E1I*  f-M4  f-J#Z) 




Actually,  I  have  been  taking  a  slight  liberty  in  the  justifications  I  have  shown  since  as 
each  assertion  is  moved  along  a  flow  link,  a  new  assertion  is  created  with  a  new 
situation  tag.  I  have  used  the  fact  name  of  the  original  assertion  in  these 
justifications  as  a  notational  convenience. 

In any event, F-314 is all that is needed to conclude that the third case of the test
is applicable. Control therefore flows to RW-3 which produces assertions similar to those
of RW-2. The control flow link from RW-3 to JOIN-2 is now active.

JOIN-2 produces a single output object which is the join of the outputs produced
by RW-1, RW-2, and RW-3. These are the final accumulations produced by the internal
recursive calls to FI1. REASON names this output of JOIN-2 FA. This output then
flows to JOIN-3 where it is joined with the output of JOIN-1. Examination of the
diagram shows that the output of JOIN-1 is CURRENT-ACC, the input to the outer FI1, since
CURRENT-ACC flows to both cases of the join. Thus, the two inputs to JOIN-3 are FA and
CURRENT-ACC; REASON names the output of this join FINAL-ACCUM-0. The plan
diagram specifies that this is the output of the main segment FI1. Symbolic
interpretation is, therefore, complete and REASON now tries to prove the asserts of
the main segment.
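
For readers who want the whole computation in one place, the following is a minimal Python sketch of the program whose plan diagram has just been interpreted. It is reconstructed from the prose description only and is not the report's code: Python lists stand in for LISP lists, integers stand in for unique-ids, and the names fi1, fast_intersect, and asserts_hold are chosen here for illustration.

```python
def fi1(list_1, list_2, current_accumulation):
    """Internal recursive routine: both inputs sorted by increasing id,
    accumulation kept in reverse-sorted order."""
    if not list_1 or not list_2:              # either null test succeeds
        return current_accumulation
    first_1, first_2 = list_1[0], list_2[0]
    if first_1 == first_2:                    # case 1: same object, cons it
        return fi1(list_1[1:], list_2[1:], [first_1] + current_accumulation)
    if first_1 < first_2:                     # case 2: drop head of list 1
        return fi1(list_1[1:], list_2, current_accumulation)
    return fi1(list_1, list_2[1:], current_accumulation)  # case 3


def fast_intersect(list_1, list_2):
    """Main segment: run fi1 from an empty accumulation, then reverse."""
    return list(reversed(fi1(list_1, list_2, [])))


def asserts_hold(list_1, list_2, current_accumulation):
    """Check, on one concrete input, the three output clauses of fi1."""
    final = fi1(list_1, list_2, current_accumulation)
    common = set(list_1) & set(list_2)
    return (all(x in final for x in common)                  # nothing missed
            and all(x in final for x in current_accumulation)  # nothing lost
            and all(x in common or x in current_accumulation
                    for x in final))                         # nothing extra
```

Checking the clauses on sample inputs is of course much weaker than the symbolic proof; the sketch only makes concrete what the quantified statements claim.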

There are three things to be proved: (1) All elements of the intersection are
accumulated. (2) Nothing is lost from the CURRENT-ACCUMULATION input. (3) Nothing
extraneous is accumulated. I will show the proof of the first of these claims. This is
stated formally as follows:

(For-all (:x)
  (And ((Member List-1 :x) FI1-In)
       ((Member List-2 :x) FI1-In))
  ((Member Final-Accum-0 :x) FI1-Out))

To  begin  the  proof  of  this  statement  REASON  creates  an  anonymous  object  to  stand 
for  the  variable  of  the  quantified  statement  and  then  assumes  the  antecedent  clause 
with  this  anonymous  object  substituted  for  the  variable. 

F-1000 (And ((Member List-1 Obj-1) FI1-In)
            ((Member List-2 Obj-1) FI1-In))



REASON  also  establishes  the  sub-goal  for  the  quantified  statement: 

r-w  (Goat  ((NoaRor  f inol • Accuo-f  0RJ-1)  fll-OUT)  ( Ach1o»o-*oo)  ...)) 

for  ((for-ott  ( :*) 

(And  ((NoaOar  lllt-l  .«)  fll-IR) 

IIIMir  l»»t-l  »)  HI  - 1*0) 

( (Wiittr  r mot -Acci*i-S  »)  f !i -Out ) ) 
in  ((And  ((NwnOor  lllt-l  ORj-l)  fll-IR) 

((Nondtar  I1|t-|  ORj-1)  fll-IR)))) 

The  antecedent  of  the  quantified  statement  is  then  expanded,  yielding  the  two 
conjuncts: 

F-1001 ((Member List-1 Obj-1) FI1-In)
F-1002 ((Member List-2 Obj-1) FI1-In)

However, rules relating to list structure conclude from these that neither list is
empty:


F-1003 ((Not (Object-type List-1 Empty-list)) FI1-In)    (List-def F-1001)
F-1004 ((Not (Object-type List-2 Empty-list)) FI1-In)    (List-def F-1002)

This brings in the applicable assertions for the non-null cases of the two NULL? tests,
which in turn causes the conditional-control-flow link from the NULL? test to REAL-WORK
to be declared active. This in turn brings in the assertion saying that the REAL-WORK
case of JOIN-3 is applicable; the output of JOIN-3 is therefore now declared to be ID to
the output of REAL-WORK which is FA. This triggers the identification mechanisms to
create a new subgoal in which FINAL-ACCUM-0 is replaced by FA.

f-tl  (Goo)  (<n*-R«r  fA  OR)  1)  FI)  OUT) 
for  ( ( f  or  -  oil  (:«) 

(And  ( (WowRtr  lllt-l  :■)  fil-IR) 

( (noMfcor  Hit-/  :«)  fll-IR)) 

(|*0«M>or  fA  :.)  »  II -OUT))  ( AcMdvt-RROT  1)) 
in  ((And  ((H«M>or  lilt  )  OR)-))  fll-IR) 

( (M««Ror  lllt-l  ORJ-I)  fll-IR)))) 

The data base is now quiescent. REASON next expands the antecedents of the
quantified statement:




»-IMS  (Or  ((first  Uit-1  Obj-1 )  Ml- IN)  (F-1NN1  l  lit mttlbn ) 

((Hrmferr  (R.U  1 1st  - 1]  Otj-1)  MI-IN)) 

F-1NN6  (Or  ((Mr»t  t»»t-l  Obj-1)  Ml-IN)  (F-1NI1  I1|t-H«M-N«f  mil  1»n) 

((Mf*b«r  [#ii|  Liit  l]  Obj-1)  Ml-IN)) 

The reference expressions in both expressions can be resolved since the output
assertions of the two CDR segments state what the REST of each list is. Notice that
although these facts are tagged with situation tags such as CDR-1-OUT, there are no
side-effects in this program and all assertions except those involving newly created
objects may be moved back through any segment to the beginning of the program.
(REASON has a different fact name for the same facts in the initial situation; however,
for simplicity of presentation I am ignoring this detail.) We obtain:

r-IMf  (Or  ((first  tilt-1  Obj  1)  Ml-IN)  ( Rtf trtnci -Itio lul ion  FINNS  f-JNF) 

((M»«>b«r  Rut •  1  Obj  1)  Ml-IN)) 

f  INNS  (Or  (((  ir,t  lull  Obj-1)  Ml-IN)  ( Rtfinnci -Rtiilul  ion  FINIS  F-MS) 

(<He*b»r  Mil  I  Obj  1)  Ml-IN)) 

There are now several strategies which REASON might pursue. It could attempt a
proof by cases, splitting either of the above disjunctions (F-1007 or F-1008) or the
disjunction of select statements associated with either the test =<>? or the join JOIN-2.
The current version of REASON has a preference for splitting disjunctions arising
from the goal, rather than case-split or join oriented disjunctions. REASON attempts
to show the goal by splitting F-1007. It first assumes:

F-1009 ((First List-1 Obj-1) FI1-In)    (Case-split-assumption F-1007 ...)

However, the system has already asserted F-41 which states that the first of LIST-1 is
FIRST-1. Thus, an identification is made, leading to the following:

F  -  1*1*  (IN  F  ir«t  1  Obj  I )  (FtMl-IN  F-llll  F-NN) 

F-lill  ((Unique  l«  Obj  I  N^.btr-1)  Noknt*  1-Out)  ( Ifent If Icit ten  F-llll  F-Ni) 

Now the system chooses to case-split the second disjunction, F-1008. It obtains:




F-1117  ((Fir  it  lt.tr  Obj  1)  fill*)  (Ca>«-  >p)  <  t  -  aotaipt  ion  F-llll  ...) 

Mil)  (Id  Mr.l-r  Obj  1)  (Part*- Id  F-1117  (•()) 

Midi*  ((Umqut  Id  Obj  1  Ntaobtr-r)  Hakntaa- 1  -Out )  (  Idant  if  teat  lan  F-llll  F-44) 

This,  in  turn  creates  another  identification: 

F  JO  1 S  (  Id  N**b*r-1  Ntwbtr-7)  ( F  gne  ■  Prap- Id  F-llll  F-1114) 

However, F-1015 means that CASE-1 of the test =<>? is applicable. Thus, the assertions
pertaining to RW-1 come in since this is the segment which follows from CASE-1.
Further identification follows:

F-llll  ((H*«b»r  C-I  Obj-l)  cons  1  OUT)  ( Idant  if  itat  tan  F-7II  F-llll) 

This last assertion interacts with the corresponding quantified statement from the
recursive call FI1-1 to create the new assertion:

F  -  II I  F  ((Haobar  F  mat  -  Accua- 1  Obj  I)  Fill-Out)  (Far-All  Fill  F-llll) 

Finally, since the control-flow coming into CASE-1 of JOIN-2 is active, the assertion
stating the applicability of this case comes in. This, in turn, brings in an ID assertion
stating that FINAL-ACCUM-1 is identical to the output of JOIN-2 which is FA. This triggers
another round of identifications:

Fine  ( Id  F  mal-Accua-l  FA)  (Jam-Salact  -..) 

F-llll  (<Hr*»b»r  FA  Ob)  1)  Fill  OUT)  ( Idant  If  Icat  ion  F-1I1F  F-llll) 

This assertion then passes directly through JOIN-2 and JOIN-3, satisfying the desired goal.
However, this was only the first case of the second case-split. REASON now revokes
the assumption F-1012 and makes the assumption that OBJ-1 is a member of REST-2, the
CDR of LIST-2.

F-1020 ((Member Rest-2 Obj-1) FI1-In)    (Case-split-assumption F-1008 ...)

Notice that all the identifications triggered by the assumption F-1012 are now out, since
that assumption has been outed by the proof-by-cases mechanism. However,
REASON still has the assertion that OBJ-1 is identical to FIRST-1 since it has not yet
revoked that assumption. Rules relating to list structure trigger, this time concluding
that the UNIQUE-ID of OBJ-1 is greater than that of FIRST-2 since it is a member of the
CDR of a sorted list of which FIRST-2 is the CAR.




F-l*J>  ((>  (Uniqut-K)  Obj-IJ  [Unique  -  Id  Mrit-JJ)  MM<)  (LHt»  r-Ull  HII  F-41  F-M) 

The reference expressions in the above assertion need to be resolved and both
referents are available. We get:


r-i»r/  ((>  »u<<>t>«r-i  n^t*r m  t*)  (»#f-»#ioii/t»pn  t-tnt  run  f-4j) 

Rules reflecting knowledge about numbers and the one-to-one character of the UID
now trigger:

F.JfiJ  ((Not  (<  Number  1  Ni**.b«r-1))  Ml-IN)  (Ntw-Prop  f  - 1(22 ) 

t-III«  ((Not  ((quel  Number-!))  fll-IN)  ( Nt*»  Prop  F  -  HH ) 

F-l|}$  (Not  (Id  Flrit-?  0t>  J  - 1 ) )  (UlO-Prpp  F-1IM  F-1111  F-41) 

Assertions F-1024 and F-1023, however, imply that CASE-1 and CASE-2 of the test =<>? are
inapplicable and F-1022 implies that CASE-3 of the test is applicable. Therefore, RW-3 is
the segment to which control is transferred. However, during the symbolic evaluation
of RW-3, a quantified statement was created which stated that any object which was
both a member of LIST-1 and REST-2 is a member of FINAL-ACCUM-3, the output of the
internal call to FI1. We obtain:

F-1SJ6  ((Hombqr  Mnpl-AcctPN  I  Obj-l)  NVI-OL/T)  (For-AII  F-U2I  F-1149  ...) 

As above, this passes through the joins directly and the desired goal is achieved in this
case. Thus, REASON has finished proving that if the object OBJ-1 is the first object
of LIST-1 it will be a member of the final output. REASON now considers the other
case of the disjunction F-1007. It assumes:

F-1027 ((Member Rest-1 Obj-1) FI1-In)    (Case-Split F-1007 ...)

which triggers a set of deductions similar to those which followed from the assumption
F-1020. REASON again decides to try proof by cases on the disjunction F-1008. The
first case of this proof brings back in the assumption F-1012 which states that the first
of LIST-2 is OBJ-1. This selects CASE-2 of the test =<>? and rules out the others as above.
The quantified statement produced by symbolic evaluation of the internal call to FI1
within RW-2 is triggered just as above, leading to the desired conclusion. This leaves
only one final case to consider. REASON brings in the assumption F-1020, stating that
OBJ-1 is a member of the CDR of LIST-2.




REASON now has the following four facts in:


F-1001 ((Member List-1 Obj-1) FI1-In)

F-1002 ((Member List-2 Obj-1) FI1-In)

F-1027 ((Member Rest-1 Obj-1) FI1-In)

F-1020 ((Member Rest-2 Obj-1) FI1-In)

The desired goal has not yet been obtained, so REASON finally resorts to splitting
the disjunction associated with JOIN-2. First it assumes that CASE-1 of the join is
selected, making the assumption that CASE-1 of the test =<>? is selected. Thus RW-1 is
active. However, associated with its internal call to FI1 is a statement that any
member of both REST-1 and REST-2 will be a member of the output. This will satisfy
the sub-goal as we've seen above. Selecting CASE-2 of JOIN-2 will take control to RW-2
which says that any object which is in REST-1 and LIST-2 will be a member of the
output. Finally, selecting CASE-3 will lead to RW-3 and the requirement that the object
be in LIST-1 and REST-2. Since OBJ-1 satisfies all of these requirements it is a member
of the output of JOIN-2 in all cases. Thus, the proof is complete.

The proofs of the other goals follow along similar lines, involving no mechanisms
other than those shown so far. The proof shown above was constructed by the first
implementation of REASON, although some technical details were different. In the
next two chapters I will turn to the issues of categorizing standard plan fragments
which motivated the new implementation effort.




Chapter  8:  The  Temporal  Viewpoint 

The problem with the proof given in the last chapter is that it involved a lot of
hard work proving things which most programmers would recognize as examples of
things they already know. This puts a premium on the recognition of pre-proven
"plan fragments". Most previous research on the use of pre-proven schemata, such as
that of (Gerhart, 1975), has relied on syntactic templates and correctness-preserving
transformations on the program syntax. In contrast, the Programmer's Apprentice
represents its knowledge of standard programming techniques in the plan formalism,
using data-flow and control-flow links to abstract away from the syntax of the
programming language. In addition, we use symbolic evaluation and a situational logic
to talk about the internal states of the computation.

Two  distinct  segmentations  of  a  program  can  be  made,  each  revealing  different 
aspects  of  its  teleological  structure.  The  first  is  a  segmentation  of  the  surface 
features,  called  the  surface  viewpoint  which  abstracts  out  the  communication 
primitives  of  the  programming  language;  the  second  expands  the  program  into  a 
sequence  of  situations  which  are  regrouped  into  the  temporal  viewpoint.  This  technique 
allows  programs  to  be  described  and  catalogued  in  a  high-level  vocabulary  which  is 
suitable  for  use  as  a  very  high-level  programming  language  or  as  a  command  language 
to  a  programmer’s  apprentice  system. 


Section  8.1:  A  Paradigmatic  Example 

Consider the following LISP routine which traverses a tree and builds a list of its
leaf nodes (i.e. it is a fringe program):

(define fringe (tree) (fringe-1 tree nil))

(define fringe-1 (current-node accumulation)
  (cond
    ((test-leaf current-node) (cons current-node accumulation))
    (t (fringe-1 (left current-node)
                 (fringe-1 (right current-node)
                           accumulation)))))


[Figure: plan diagram of the FRINGE segment, with inputs THE-TREE and THE-ACCUMULATION and output THE-FRINGE]

The program might be paraphrased as follows: If the current-node is a leaf node
then cons it onto accumulation, the current list of leaf nodes; if it is not then add to
accumulation all those leaf nodes which are daughters of the right node of the
current-node. Then add to the result of that computation all those leaf nodes which are
daughters of the left node of the current-node. If started with the tree as the
current-node and nil as the accumulation, the program will build a list containing exactly
the leaf nodes of the tree.
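
The paraphrase can be followed step by step in a short Python rendering of the same routine. This is a sketch under stated assumptions rather than the report's code: a nonterminal node is represented as a (left, right) tuple and anything else counts as a leaf, and the helper is_leaf is a hypothetical stand-in for the leaf test.

```python
def is_leaf(node):
    # Assumption: nonterminal nodes are (left, right) tuples;
    # every other value is treated as a leaf.
    return not isinstance(node, tuple)


def fringe_1(current_node, accumulation):
    if is_leaf(current_node):
        # cons the leaf onto the current list of leaf nodes
        return [current_node] + accumulation
    left, right = current_node
    # First accumulate the leaves under the right node, then add
    # the leaves under the left node to that result.
    return fringe_1(left, fringe_1(right, accumulation))


def fringe(tree):
    return fringe_1(tree, [])
```

Because the right branch is accumulated first and each leaf is added at the front, the resulting list shows the leaves in left-to-right order.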


[Figure: A Binary Tree, showing nonterminal nodes and terminal (leaf) nodes]


A standard proof of correctness of the above program would closely follow the
above description, using induction to argue that the right recursive call accumulates all
the leaf nodes of the right branch and then that the left recursive call accumulates
the others. However, such a proof does not make use of knowledge which is second
nature to most LISP programmers. The fringe program follows a standard pattern of
double recursion on the branching structure of the tree. In this case the standard
tree-recursion is augmented by the presence of (1) a cons and (2) a second argument
in the function definition of fringe-1; the purpose of this argument is to accumulate
the set of leaf nodes. Were we to ignore these extra features, we would be left with
a program which does nothing but traverse a tree.


(define traverse-a-tree (tree)
  (cond ((test-leaf tree))
        (t (traverse-a-tree (left tree))
           (traverse-a-tree (right tree)))))

Although the logic underlying this code is a cliche of LISP programming, it is
not possible to specify this segment's behavior using standard input/output descriptions.
Indeed, traverse-a-tree produces no outputs at all, and thus has no I/O behavior to
specify. However, the segment does have useful temporal behavior: during its
computation it visits every node of the input tree. Secondly, during its computation
traverse-a-tree filters the nodes of the tree into leaf nodes and non-leaf nodes, creating
a set of control states in which precisely the leaf nodes are available. Thus, the I/O
descriptions of Hoare [Hoare, 1969] logic are inadequate for this purpose and a logic
of greater strength is needed. Those interested in the logic of computer programming
are now studying such logics [Pnueli, 1977] [Pratt, 1978] although with other purposes
in mind.
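
To make the notion of temporal behavior concrete, here is a small Python sketch of a traversal that returns nothing but whose control states can be observed. The trace argument is instrumentation added purely for illustration and is not part of the routine being described; the tuple representation of trees and the names is_leaf and leaf_states are likewise assumptions.

```python
def is_leaf(node):
    # Assumption: nonterminal nodes are (left, right) tuples.
    return not isinstance(node, tuple)


def traverse_a_tree(tree, trace):
    """Visit every node of the tree; there is no output value."""
    trace.append(("visit", tree))
    if is_leaf(tree):
        # A control state in which exactly a leaf node is available.
        trace.append(("leaf-available", tree))
        return None
    left, right = tree
    traverse_a_tree(left, trace)
    traverse_a_tree(right, trace)
    return None


def leaf_states(tree):
    """Collect the leaves seen in the leaf-available control states."""
    trace = []
    traverse_a_tree(tree, trace)
    return [node for kind, node in trace if kind == "leaf-available"]
```

An input/output description of traverse_a_tree can only say that it returns nothing; the recorded sequence of states is where its useful behavior shows up.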

If we ignore the tree traversal part of fringe, focusing our attention on the rest of
the code's behavior, we can give an equally simple characterization of the temporal
behavior of this fragment of the code. Having noted that tree-traversal produces a set
of control states in which the program visits the leaf nodes, we may then further
observe that the rest of the program acts in these control states. In each such state
there is a cons operation which adds the current leaf node onto the current
accumulation. This new accumulation is then passed on to either the next occurrence
of a control state in which a leaf node is visited, or if there are no further leaf nodes
the accumulation is passed out of the program as the answer. This process of
sequentially accumulating additional values is also a cliche of LISP programming which
I will refer to as sequential-cons-accumulation.

This leads to a different view of the fringe program. We may now regard it as a
"composition" (in the sense of functional composition) of a tree-traversal, a leaf-filter,
and a sequential-cons-accumulation. In contrast to normal compositions, which
communicate by passing a set of data-objects each of which exists as a unified object
at the time of the functional invocation, this composition instead passes temporal
collections of values which can be regarded as a unified object only by abstracting
away from the program's sequential behavior. This view of a recursive program as
being composed of a generator together with a consumer is used in the languages CLU
[Liskov et al., 1977] and ALPHARD [Wulf et al., 1976] and is the basis of both the
language APL [Iverson, 1962] and the loop analyzer used in the programmer's
apprentice project [Waters, 1978].
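
The generator-and-consumer reading of fringe can be illustrated with a minimal sketch. It is written in Python rather than the LISP of this report, and the tree representation (a leaf is any non-tuple value, an interior node is a pair) and all function names are assumptions made only for illustration.

```python
from functools import reduce

# Assumed representation: a leaf is any non-tuple value,
# an interior node is a pair (left, right).
def is_leaf(tree):
    return not isinstance(tree, tuple)

def tree_traversal(tree):
    """Generator: visits every node of the tree, one at a time."""
    yield tree
    if not is_leaf(tree):
        left, right = tree
        yield from tree_traversal(left)
        yield from tree_traversal(right)

def leaf_filter(nodes):
    """Passes along only the leaf nodes of the stream it is given."""
    return (node for node in nodes if is_leaf(node))

def sequential_cons_accumulation(values):
    """Conses each value onto the accumulation, starting from the empty list."""
    return reduce(lambda acc, value: [value] + acc, values, [])

def fringe(tree):
    """Composition of the three fragments; values flow one at a time."""
    return sequential_cons_accumulation(leaf_filter(tree_traversal(tree)))

print(fringe((("a", "b"), ("c", ("d", "e")))))  # ['e', 'd', 'c', 'b', 'a']
```

No intermediate list of nodes is ever built here: the values passed between the three stages exist only spread out over the run of the program, which is the sense in which the composition passes a temporal collection.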

It would be desirable to be able to construct the following (more natural) proof
of the fringe program: First, we already know that tree-traversal visits every node of
the tree and that it filters out all but the leaf nodes. Second, each such node is
passed to sequential-cons-accumulation. Third, we already know that
sequential-cons-accumulation will return a list of exactly those objects which it was
passed as inputs. Finally, since the output of sequential-cons-accumulation is the
output of fringe, it follows that the fringe program produces a list of exactly the leaf
nodes of its input tree.


[Figure: Temporal View of Fringe Program: Tree Traversal and Sequential-Cons-Accumulation]

The advantage of this method is that we can make use of previously constructed
proofs of the properties of tree-traversal and sequential-cons-accumulation. However, in
order to reap this advantage we will have to construct rules of inference which allow
us to prove temporal properties of program segments and which tell us when it is
allowable to apply these properties.



In  previous  chapters  a  formalism  was  developed  for  describing  programs  while 
abstracting  away  from  the  primitives  of  the  programming  language,  using  instead  the 
notions  of  segmentation,  data,  and  control-flow.  I  have  also  presented  a  symbolic 
interpreter  for  these  descriptions.  In  designing  this  symbolic  interpreter  I  was  careful 
to  represent  explicitly  many  items  which  had  only  implicit  representation  in  my  earlier 
system [Rich & Shrobe, 1976]. This more explicit representation was created so that I
could  easily  define  and  discuss  program  properties  other  than  simple  I/O  behavior  and 
so  that  program  fragments  which  are  characterized  by  their  temporal  behavior  can  be 
catalogued. 

The symbolic interpreter used in REASON defines the basic notions needed to
build temporal descriptions by connecting segments, situations, and objects into
applications (an application is a segment together with its input and output mappings,
as well as its input and output situations). As the interpreter goes through its
symbolic evaluation of a plan diagram it records the applications which it encounters
in that process. Since we are interested in the temporal properties of a segment, one
of our main goals will be to discover exactly what applications occur within a
particular segment's execution; knowing this, we will be able to group these
applications into the temporal viewpoint segmentation. However, I must first make a
few additional observations about the nesting and grouping of applications.

Intuitively, the application A of segment-1 occurs during the application B of
segment-2 if, while interpreting B, the interpreter will encounter an input situation in
which it applies segment-1 to some set of inputs. Segment-1 together with the set of
inputs, the set of outputs, the mappings of these to the input and output names of
segment-1, and the input and output situations constitute the application A.

More formally, we say that an application A of segment-1 occurs within an
application B of segment-2 if the following conditions hold:

(a) The segment of application A is segment-1 and the segment of application B is
segment-2.

(b) segment-1 is a subsegment of segment-2.

(c) Each data-flow link terminating at segment-1 originates either at an input of
segment-2 or at an output of some other application C where C occurs within B.

(d) Each data-flow link originating at segment-1 terminates at either an output of
segment-2 or at an input of some other application C where C occurs within B.

(e) Each conditional-control-flow link which terminates at A originates at an
application C which occurs within B. Furthermore, the conditional-control-flow link
originates at an applicable case of the segment of C.

The relationship is transitive, so that if A is an application within B and B is an
application within C, then A is an application within C as well. We express this as
follows:

(application-within A B)

which states that A is an application within B.

Frequently it is more useful to talk about applications of segments of a particular
(plan or spec) type rather than applications of a particular segment. We use the
following notation to express this idea:

(application-of-type <plan-type> <application>)

The above can be defined by the following equivalence:

(application-of-type type-1 appl-1)
≡
(plan-type [segment-part appl-1] type-1)

We noted earlier that in the tree-traversal plan there is an application of a segment of
type tree-traversal for every node of the tree. However, some of these applications
occur within the sub-segment called on the left branch while others involve the right.
In this case, we are concerned not only with applications of the particular surface
sub-segments, but also with applications of all segments of type tree-traversal.

We say that there is an occurrence of plan-type type-1 within the application A of
segment-1 if:

(i) There is an application B of segment seg-1 within the application A.

(ii) The plan-type of seg-1 is type-1.

We express this as follows:

(occurrence-within <application-1> <plan-type> <application-2>)

The following equivalence holds:

(occurrence-within appl-1 type-1 appl-2)
≡
(and (application-within appl-1 appl-2)
     (application-of-type type-1 appl-2))

We can now state a formal property of the tree-traversal fragment which will be
useful throughout the rest of this discussion, namely that it visits every node:

(For-all (appl-1) (application-of-type tree-traversal appl-1)
   (For-all (node) (node [input appl-1 the-tree] node)
      (there-is (appl-2) (occurrence-within appl-1 tree-traversal appl-2)
         such-that (input appl-2 the-tree node))))

It is convenient to think of sets of applications actually being aggregated into a
segment. This is done using the notion of an occurrence set. The occurrence set of
plan-type type-1 within the application A is the set of all applications within A whose
plan-type is type-1. Intuitively, the occurrence set of plan-type type-1 is a virtual
segment consisting of that part of the program's temporal history which includes
applications of a particular type. In the fringe program, for example, the occurrence
set of type fringe is essentially the tree traversal fragment of the program, consisting
of all the recursive calls to fringe.
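
The notions of nested applications and occurrence sets can be sketched as follows, in Python rather than in the notation of REASON. The sketch assumes that the interpreter has already recorded, for each application, the application it directly occurs within; the data-flow and control-flow conditions (a) through (e) are not checked. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)
class Application:
    """One recorded application of a segment (hypothetical record format)."""
    name: str
    plan_type: str
    parent: Optional["Application"] = None      # application it directly occurs within
    children: List["Application"] = field(default_factory=list)

    def apply_sub(self, name, plan_type):
        """Record an application of a sub-segment inside this application."""
        child = Application(name, plan_type, parent=self)
        self.children.append(child)
        return child

def occurs_within(inner, outer):
    """True if `inner` occurs within `outer`; the relation is transitive."""
    current = inner.parent
    while current is not None:
        if current is outer:
            return True
        current = current.parent
    return False

def occurrence_set(outer, plan_type):
    """All applications within `outer` whose plan type is `plan_type`."""
    found, pending = [], list(outer.children)
    while pending:
        appl = pending.pop()
        if appl.plan_type == plan_type:
            found.append(appl)
        pending.extend(appl.children)
    return found

# A fringe application containing a cons and two nested recursive calls.
main = Application("fringe-1", "fringe")
left = main.apply_sub("fringe-2", "fringe")
main.apply_sub("cons-1", "cons")
left.apply_sub("fringe-3", "fringe")
print(sorted(a.name for a in occurrence_set(main, "fringe")))  # ['fringe-2', 'fringe-3']
```

The occurrence set cuts across the surface nesting: fringe-3 belongs to it even though it is only reachable through fringe-2.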


Section  8.2:  Situations  and  Orderings 

Several  rules  are  used  to  impose  a  partial  order  (representing  temporal  ordering) 
on  the  situations  occurring  in  a  plan  diagram.  Most  obvious  are  those  imposed  by  the 
relationship of the situations to the segments of the plan-diagram. The following rules
capture the constraints that (1) the input situation of a segment precedes its output
situation, (2) the input situation of a main segment precedes the input situation of
any of its sub-segments, and (3) the output situation of any sub-segment precedes the
output situation of its main segment.




(Rule ((:f (Is-a :segment Segment))
       (:g (Input-situation :segment :in-sit))
       (:h (Output-situation :segment :out-sit)))
   (Assert (Comes-before :in-sit :out-sit)
           (Seg-sit-order :f :g :h)))

(Rule ((:f (Sub-segment :main-seg :sub-seg))
       (:g1 (Input-situation :main-seg :in-sit-1))
       (:g2 (Input-situation :sub-seg :in-sit-2))
       (:h1 (Output-situation :main-seg :out-sit-1))
       (:h2 (Output-situation :sub-seg :out-sit-2)))
   (Assert (Comes-before :in-sit-1 :in-sit-2)
           (Nested-seg-order :f :g1 :g2))
   (Assert (Comes-before :out-sit-2 :out-sit-1)
           (Nested-seg-order :f :h1 :h2)))

Data-flow and control-flow links are the only other constraints ordering the
situations. If there is a data- or control-flow link between two segments then the
output situation of the first segment must precede the input situation of the other.


(Rule ((:f (Dataflow :seg-1 :seg-2))
       (:g (Output-situation :seg-1 :out-sit))
       (:h (Input-situation :seg-2 :in-sit)))
   (Assert (Comes-before :out-sit :in-sit)
           (dflow-order :f :g :h)))

(Rule ((:f (Controlflow :seg-1 :seg-2))
       (:g (Output-situation :seg-1 :out-sit))
       (:h (Input-situation :seg-2 :in-sit)))
   (Assert (Comes-before :out-sit :in-sit)
           (dflow-order :f :g :h)))

(Rule ((:f (Conditional-Control-flow (:seg-1 :case-1) :seg-2))
       (:g (Output-situation :seg-1 :out-sit))
       (:h (Input-situation :seg-2 :in-sit)))
   (Assert (Comes-before :out-sit :in-sit)
           (dflow-order :f :g :h)))




Since  these  constraints  in  general  only  impose  a  partial  ordering  on  the  situations,  plan 
diagrams may be thought of as representing a (pseudo) parallel computation which
imposes  only  the  minimal  constraints  on  segment  ordering  necessary  to  achieve  the 
goals  of  the  segment  specified  by  the  plan-diagram. 
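
The effect of these ordering rules can be illustrated by a small sketch, in Python rather than in the rule language of REASON. It assumes that a situation can be represented as a pair of a segment name and "in" or "out", and it computes the Comes-before relation as a transitive closure; the function names and the example plan are hypothetical.

```python
from itertools import product

def ordering_facts(segments, sub_segments, flows):
    """Direct Comes-before facts produced by the ordering rules.

    segments:     iterable of segment names
    sub_segments: list of (main, sub) pairs
    flows:        list of (source, target) data- or control-flow links
    A situation is represented as (segment, "in") or (segment, "out").
    """
    facts = set()
    for seg in segments:                       # input precedes output
        facts.add(((seg, "in"), (seg, "out")))
    for main, sub in sub_segments:             # nesting of sub-segments
        facts.add(((main, "in"), (sub, "in")))
        facts.add(((sub, "out"), (main, "out")))
    for src, dst in flows:                     # data- and control-flow links
        facts.add(((src, "out"), (dst, "in")))
    return facts

def transitive_closure(facts):
    """Smallest transitive relation containing the given facts."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

def comes_before(closure, sit_1, sit_2):
    return (sit_1, sit_2) in closure

# A main segment whose sub-segments left and right both feed combine.
order = transitive_closure(ordering_facts(
    ["main", "left", "right", "combine"],
    [("main", "left"), ("main", "right"), ("main", "combine")],
    [("left", "combine"), ("right", "combine")]))
print(comes_before(order, ("left", "in"), ("combine", "out")))   # True
print(comes_before(order, ("left", "out"), ("right", "in")))     # False
```

In this example nothing orders left against right in either direction, so the plan leaves them free to run in either order; this is the sense in which the ordering is only partial.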

Finally we may define a notion of a situation belonging to a particular
application A, namely that it is an input or output situation of some application
occurring within A:

(situation-of-application A sit)
≡
(There-is (b) (application-within A b)
   such-that (or (input-situation b sit)
                 (output-situation b sit)))


Section 8.3: Temporal Collections

I would like to define the idea of a collection of objects distributed in time,
rather than gathered together in a data-structure. The motivation for this, as
mentioned earlier, is to be able to describe what a program fragment like tree-traversal
does. I will show that tree-traversal may be regarded as producing such a temporal
collection whose members are exactly the nodes of the input tree.

A temporal collection is a set of pairs of objects and situations such that each
object exists in the situation with which it is paired. We may talk about elements of
the collection and their object and situation parts as follows:

(Element <temporal-collection> <element>)
(Object-part <element> <object>)
(Situation-part <element> <situation>)

Since the second two of the predicates are functions, they may be referred to with
the bracket notation.




C is a temporal collection of the application A if:

(i) C is a temporal collection.

(ii) Every element of C has a situation part which is a situation of application A.

It is easy to create temporal collections by picking applications of sub-segments
within some main segment. For each such sub-segment we can choose an input
(output) object and pair it with the input (output) situation. However, it is usually
more useful to consider an occurrence set of a particular type and to construct the set
formed by those objects which are assigned to a particular input (output) port of each
occurrence. These objects are then paired with the input (output) situations of each
occurrence to form a temporal collection.

For example, we may consider the occurrence set of type tree-traversal within any
application of tree-traversal. For each such occurrence, choose the input which is
assigned to tree-traversal's input port the-tree. Pair each such object with the input
situation of the occurrence to which it is an input. This temporal collection contains
exactly the nodes of the tree input to the outermost application of tree-traversal.

From now on I will use the term temporal collection only to refer to temporal
collections generated by the <name> input (output) of the occurrence set of <plan-type>
within the application A. It is denoted as follows:

(Temporal-Collection A The-Tree Tree-Traversal C)

(Temporal-Collection <application> <input-name> <plan-type> <collection-name>)

which says that C is the temporal collection generated by the the-tree input port of the
occurrence set of type tree-traversal.

So far I have talked about a temporal collection as a set of pairs of objects and
situations. Such pairs are said to be elements of the temporal collection. However,
the interest is usually not with the pairs but only with the object parts of the pairs.
An object is a member of a temporal collection if there is an element of the temporal
collection whose object part is the object in question. Notice that under the element
relationship the temporal collection is a set; we are only interested in the presence of
a pair in the collection, not the number of occurrences. However, under the
membership relationship, the temporal collection is a multi-set, with objects possibly
occurring more than once.



As with other data-structures, it is frequently important to have an ordering
relationship on the objects in a temporal collection. The temporal ordering of the
situations provides a natural method for defining such an ordering. We say that
element-1 precedes element-2 in the temporal collection C if:

(i) There is an element of C whose object part is element-1 and whose situation part is
situation-1.

(ii) There is an element of C whose object part is element-2 and whose situation part is
situation-2.

(iii) situation-1 comes-before situation-2.

Notice that since situations are only partially ordered, the elements of a temporal
collection are in general only partially ordered. If the ordering of the elements of a
temporal collection is total, then we say that the collection is a temporal sequence.
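
The distinction between elements and members, and the ordering inherited from the situations, can be made concrete with a short sketch in Python. It assumes that situations are plain names, that the Comes-before relation is supplied as an already transitively closed set of pairs, and that precedence is asked of elements (pairs) rather than of bare objects; the class is hypothetical and is not part of REASON.

```python
from collections import Counter

class TemporalCollection:
    """A set of (object, situation) pairs plus an ordering on situations."""

    def __init__(self, elements, comes_before):
        self.elements = set(elements)          # set of (object, situation) pairs
        self.comes_before = set(comes_before)  # pairs (sit_1, sit_2), assumed closed

    def members(self):
        """Object parts only: a multi-set, since an object may recur."""
        return Counter(obj for obj, _sit in self.elements)

    def precedes(self, element_1, element_2):
        """element_1 precedes element_2 if its situation comes before."""
        return (element_1 in self.elements and element_2 in self.elements
                and (element_1[1], element_2[1]) in self.comes_before)

    def is_temporal_sequence(self):
        """True when every two distinct elements are ordered one way or the other."""
        elems = list(self.elements)
        return all(self.precedes(a, b) or self.precedes(b, a)
                   for i, a in enumerate(elems) for b in elems[i + 1:])

pairs = {("x", "s1"), ("y", "s2"), ("x", "s3")}
partial = TemporalCollection(pairs, {("s1", "s2")})
total = TemporalCollection(pairs, {("s1", "s2"), ("s2", "s3"), ("s1", "s3")})
print(partial.members()["x"])          # 2
print(partial.is_temporal_sequence())  # False
print(total.is_temporal_sequence())    # True
```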


Section  8.4:  Temporal  Collections  Inputs  and  Outputs 

We now extend the specification language to allow temporal collections to serve
as segment inputs and outputs. Thus, the specs for tree-traversal may now be stated
as follows:


(defspecs tree-traversal
   (inputs: the-tree)
   (expect (Object-type the-tree binary-tree))
   (outputs: the-nodes)
   (Assert (Object-type the-nodes temporal-collection)
           (for-all (:the-node) (node the-tree :the-node)
                    (member the-nodes :the-node))
           (for-all (:the-node) (member the-nodes :the-node)
                    (node the-tree :the-node))))

Which states that tree-traversal produces a temporal collection output whose elements
are exactly the nodes of the input tree. We can now state the plan diagram
assertions which link the temporal collection generated by the occurrence set of
tree-traversal to the output the-nodes.




(defplan tree-traversal
   (sub-segments test-leaf process-non-terminal
                 process-terminal)
   (flow-diagram
      (dataflow
         (temporal-collection the-tree leaf? tree-traversal)
         (output tree-traversal the-nodes))
   )
   (constraints
      (spec-type test-leaf leaf?)
      (plan-diagram process-non-terminal)
   ))

It is now possible to prove that the plan diagram for tree-traversal is consistent
with its specs. In doing so REASON uses a form of computational induction,
assuming that a temporal property holds of all occurrences of a particular plan type
within the main application. If on the basis of this assumption it can deduce that the
same property holds for the main application, then it is legitimate to conclude that the
property is true for any application of that plan-type. (This is actually only weak
correctness in the Hoare sense, since it does not prove termination.) I call this form
of computational induction plan-type computational induction.


[Figure: Plan Diagram of Tree Traversal]


To begin the proof REASON assumes that there is an arbitrary application
tree-traversal-1 of a segment of plan-type tree-traversal. Anonymous objects are chosen
to stand for the input situation of the application. Anonymous objects are also chosen
to stand for the inputs to the application.

(application-of-type tree-traversal tree-traversal-1)
(input tree-traversal-1 the-tree tree-1)
(input-situation tree-traversal-1 sit-1)

Also the input expectations of the tree-traversal specs are asserted in the input
situation of the application.

((binary-tree tree-1) sit-1)




The data-flow links in the plan diagram show that test-leaf is ready for application.
REASON decides that it is impossible to prove either that tree-1 is a terminal or that
it is not. Thus, a case-split is created, assuming in the first case that tree-1 is a
terminal. The plan diagram indicates that no other segments are applicable and that
control flows to the main segment's output. The system must prove that the main
segment's assert clauses hold for the delivered output objects.

There is only one assert clause and it states that every node of tree-1 must be present
in the output the-nodes, which is the temporal collection generated by the occurrence
set of test-leaf. This is trivially true since the only node of a terminal node is itself,
and the only occurrence of test-leaf had tree-1 as its input. Thus, the plan is valid
under the assumption that tree-1 is a terminal.

If tree-1 is non-terminal then case-2 of test-leaf is applicable and
process-non-terminal is ready for application. REASON moves through the plan diagram
evaluating the sub-segments in turn. Since REASON has assumed that tree-1 is
non-terminal, the segments left and right are applicable; objects left-1 and right-1 are
created to represent their outputs and assertions are added to represent the fact that
left-1 is the left node of tree-1 and that right-1 is the right node.

(output the-left-node left left-1)
(output the-right-node right right-1)
(left-node tree-1 left-1)
(right-node tree-1 right-1)

Data-flow links then map these objects to left-traversal and right-traversal, which are
applicable since they only require their input to be a binary-tree node. Both left-1
and right-1 are binary-tree nodes since they are nodes of the binary tree tree-1.

REASON next attempts to show that the temporal collection generated by the
occurrence set of type leaf? includes exactly the nodes of tree-1. The proof is by
(plan-type) computational induction. REASON assumes that any occurrence o-1 of
plan-type tree-traversal within the main application satisfies the property which it
wishes to prove, i.e. that the temporal collection generated by the occurrences of type
leaf? within o-1 has exactly the nodes of the input tree as its members. Thus, the
occurrences of leaf? within left-traversal and right-traversal each generate a temporal
collection including exactly the nodes of left-1 and right-1, respectively.




Consider an arbitrary node node-1 of tree-1. By definition node-1 is either tree-1
itself or a node of the left node of tree-1 or a node of the right node of tree-1. If
node-1 is identical to tree-1 then it is the input to test-leaf, which is of type leaf?. If
node-1 is not identical to tree-1 then it is a node of either left-1 or right-1. In either
of these cases, REASON has shown by induction that it is a member of a temporal
collection generated by the occurrences of type leaf? within left-traversal or
right-traversal. Since these occurrences are within a sub-segment of tree-traversal, they
are sub-segments of tree-traversal itself. Thus all the nodes of tree-1 are in the temporal
collection generated by the occurrence set of test-leaf.
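
The property established by this induction can also be spot-checked on concrete trees. The following sketch, in Python rather than LISP, is only such a check and not a rendering of REASON's proof: it records the input of every test-leaf occurrence during a traversal and compares the result, as a multi-set, with an independently computed list of nodes. The tree representation and all names are assumptions made for illustration.

```python
from collections import Counter

def is_terminal(tree):
    # Assumed representation: an interior node is a pair (left, right).
    return not isinstance(tree, tuple)

def tree_traversal(tree, occurrences):
    """Traverses the tree, recording the input of each test-leaf occurrence."""
    occurrences.append(tree)            # test-leaf is applied to this node
    if not is_terminal(tree):           # the non-terminal case
        left, right = tree
        tree_traversal(left, occurrences)
        tree_traversal(right, occurrences)

def nodes(tree):
    """Independent definition of the nodes of a tree, using an explicit stack."""
    found, pending = [], [tree]
    while pending:
        node = pending.pop()
        found.append(node)
        if not is_terminal(node):
            pending.extend(node)        # push the left and right nodes
    return found

def satisfies_specs(tree):
    """Members of the generated collection are exactly the nodes of the tree."""
    recorded = []
    tree_traversal(tree, recorded)
    return Counter(recorded) == Counter(nodes(tree))

print(satisfies_specs((("a", "b"), "c")))  # True
```

A check of this kind exercises only the trees it is given, whereas the inductive argument covers every application of the plan type.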


Section 8.5: Temporal Collection Data-flows

Temporal collections are a useful abstraction mechanism only if they can serve
not only as outputs of a segment but also as inputs. In this case, data-flows between
segments might involve the flow of a temporal collection output of one segment to a
temporal collection input of a second segment. This single data-flow statement is,
however, an abstraction of the temporal behavior of the plan, summarizing many
identifiable data-flows between sub-segments of the two plans.

For example, in the temporal model of the fringe program we now have a
sub-segment called tree-traversal which outputs a temporal collection containing the
nodes of its input. This is connected by a data-flow link to sequential-accumulation,
which takes a temporal collection input. This single link summarizes the fact that each
occurrence of type leaf? within tree-traversal has a corresponding occurrence of cons in
sequential-accumulation with a data-flow link connecting the two.

Sequential-accumulation can be given a simple set of specs saying that it takes as
input a temporal collection and returns as output a single object which contains
exactly the same members as the temporal collection.




(defspecs Sequential-Accumulation
   (inputs: collection-1)
   (expect (Object-type collection-1 temporal-collection))
   (outputs: list-1)
   (assert (for-all (:member-1) (member collection-1 :member-1)
                    (member list-1 :member-1))
           (for-all (:member-1) (member list-1 :member-1)
                    (member collection-1 :member-1))))

The internal plan for sequential-accumulation is intuitively a cascade of cons segments,
each taking one input from the temporal collection and the other input from the
output of the preceding cons. The first cons takes nil as input and the last delivers its
output as the output of the whole segment.
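
The cascade of cons segments can be sketched as follows, in Python rather than LISP. A cons cell is modelled as a Python pair and nil as None; both choices, and the helper names, are assumptions made for illustration.

```python
def cons(item, rest):
    """A LISP-style cons cell, modelled as a Python pair; None stands for nil."""
    return (item, rest)

def sequential_accumulation(temporal_collection):
    """One cons per value: each takes the value and the preceding cons's output."""
    accumulation = None                    # the first cons takes nil
    for item in temporal_collection:       # values arrive one at a time
        accumulation = cons(item, accumulation)
    return accumulation                    # output of the last cons

def members(cell):
    """The members of a cons list, as a Python list."""
    result = []
    while cell is not None:
        item, cell = cell
        result.append(item)
    return result

answer = sequential_accumulation(iter(["a", "b", "c"]))
print(answer)           # ('c', ('b', ('a', None)))
print(members(answer))  # ['c', 'b', 'a']
```

The output has exactly the members of the input, as the specs require, and the input is consumed as an iterator, so the values are never gathered into a data structure before the cascade runs.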


[Figure: Sequential Accumulation with Temporal Collection Input]

In describing the tree-traversal plan fragment I needed the notion of a temporal
collection being generated by the occurrence set of the same plan type. The
sequential-accumulation fragment requires an inverse to this idea. We may think of two
distinct types of segments, those which generate a collection of values and those which
utilize the collection. In the first kind of segment it is convenient to talk about the
occurrences of segments of a particular type generating a temporal collection; in the
second kind of segment, we require the notion of a temporal collection determining
the occurrences of sub-segments of a particular type. When we say that a segment
takes a temporal collection as input, we are summarizing the idea that the objects in
the temporal collection are in a one-to-one correspondence with the members of an
occurrence set within the segment. We refer to such an occurrence set as the
temporal collection generated occurrence set, which we denote as follows:


(tc-gen-occ <plan-type> <set-of-occurrences-name>)

This allows us to represent the flow of objects of a temporal collection to a collection
of occurrences by a single data-flow statement, as is done in the statements of the
following plan diagram (which is shown pictorially above):


(defplan sequential-accumulation
   (sub-segments init seq-acc-1)
   (flow-diagram
      (dataflow (input sequential-accumulation the-temp-col)
                (input seq-acc-1 the-temp-col))
      (dataflow (output seq-acc-1 the-answer)
                (output sequential-accumulation the-answer))
      (dataflow (output init the-empty-list)
                (input seq-acc-1 the-initial-value)))
   (constraints
      (plan-type init generate-empty-list)
      (plan-diagram seq-acc-1
         (sub-segments accum-op recur)
         (constraints
            (spec-type accum-op accumulate-into-list)
            (plan-type recur seq-acc-1))
         (flow-diagram
            (dataflow (input seq-acc-1 the-temp-col)
                      (input tc-gen-occ accum-op the-current-value))
            (dataflow (output accum-op new-list)
                      (input recur current-accum))
            (dataflow (output recur the-answer)
                      (output seq-acc-1 the-answer))))))

Notice that the data-flow links between the various occurrences of accum-op impose a
temporal ordering on their execution which (in this case) is total. We require the
ordering of elements of the temporal collection to be consistent with the ordering of
the segments into which they flow. This is not an issue in the plan for fringe since
the version of tree-traversal we are considering is so abstract that it specifies no
ordering on the elements generated. Its output, and hence the input of
sequential-accumulation, is totally unordered; any mapping of the elements input to
sequential-accumulation is consistent with the segment ordering of the occurrences of
accum-op.

However, if we were to consider a more specific tree-traversal, say one which
traverses in left-to-right, depth-first order, then there would be restrictions on the
mapping; given two nodes, the node which is further to the left in the tree will be
mapped into an earlier occurrence of accum-op and will, therefore, appear earlier in the
output of seq-acc-1. This requirement guarantees that the sequential-accumulation plan
will not lose ordering constraints which were of significance to the segment which
generates its inputs.

REASON can prove that the sequential-accumulation plan satisfies its specs quite
easily. It once again uses (plan-type) computational induction. In summary, the
argument divides into two parts. The simple part is that if seq-acc-1 satisfies its specs
of accumulating all the elements of its temporal collection input into the list which is
its other input, then sequential-accumulation satisfies its specs trivially, since all it does
is to create an empty list and then call seq-acc-1 with this list as one argument and the
temporal collection as the other.

Now REASON assumes that all internal occurrences of seq-acc-1 satisfy their
specs. It follows that accum-op produces a list containing all members of the
current-accumulation input plus the one additional element which is its other input.
This element is a member of the temporal collection. The inputs to the recursive call
of seq-acc-1 are the new accumulation and the remaining members of the temporal
collection. By the specs of seq-acc-1 this will return a list containing all the objects
which were members of its list input plus all the objects which were members of the
temporal collection input. Thus, it produces a list of all the members of the temporal
collection input to the main program. By induction REASON can conclude that
seq-acc-1 satisfies its specs.

It follows that a plan which is a functional composition of TREE-TRAVERSAL with
SEQUENTIAL-ACCUMULATION will produce a list of the nodes of the tree. If a FILTER plan is
put between them so that only the terminal nodes of the tree are members of the
temporal collection output of the filter, then we will have a plan for computing the
fringe of the tree. It remains to show, however, that our original FRINGE program can,
indeed, be looked at in this way.
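As a rough illustration of this composition, the following Python sketch chains
a traversal, a filter, and an accumulation. The tree encoding (a leaf is any
non-tuple value, an internal node is a `(left, right)` tuple) and all function
names are assumptions introduced for the example, not part of the report.

```python
def tree_traversal(tree):
    """Generate every node of the tree (the temporal collection of nodes)."""
    yield tree
    if isinstance(tree, tuple):             # non-terminal: visit both subtrees
        left, right = tree
        yield from tree_traversal(left)
        yield from tree_traversal(right)


def leaf_filter(nodes):
    """Pass along only the terminal nodes."""
    return (node for node in nodes if not isinstance(node, tuple))


def accumulate(nodes):
    """Collect every member of the incoming collection into a list."""
    accumulation = []
    for node in nodes:
        accumulation = [node] + accumulation
    return accumulation


def fringe_by_composition(tree):
    return accumulate(leaf_filter(tree_traversal(tree)))
```

Because the traversal makes no promise about order, only the membership of the
result (exactly the leaves) should be relied upon.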



Chapter  9:  The  Recognition  Paradigm 

The apprentice depends on a library of pre-analyzed plan fragments for use in
analysis by inspection. So far, I have presented a formalism which allows these
fragments to be stated in a general and abstract manner. Fragments like TREE-TRAVERSAL
or SEQUENTIAL-ACCUMULATION can be easily applied to any programming language and to a
variety of data structures and syntactic constructs which represent similar temporal
behaviors.

The apprentice uses pre-analyzed plan fragments to help explain parts of
programs, matching sections of program code to particular fragments. If all parts of a
program are mapped onto some plan fragment and if these fragments are connected in
coherent ways and if the entire artifact so constructed implements a desired behavior,
we can then say that the program has been analyzed. In such cases the plan
fragments have been used as if they were proof rules (of a rather macro character),
showing that their preconditions hold and asserting their conclusions.

However, it will often be the case that only some of the code can be mapped
onto fragments in the library of plans. In such cases it would be erroneous to assert
that the program had been verified (or totally understood). Nevertheless, the program
has been partially understood and half a loaf is better than none.

The recognition process proceeds as follows. First, a simple language-dependent
process translates the source language program text into a plan diagram. Such
programs have been developed for LISP by Rich [Rich & Shrobe, 1976] and for
FORTRAN by Waters [Waters, 1977]. This diagram, called the surface plan, is
typically quite unstructured, having only that rudimentary segmentation which is
implied by primitives such as IF-THEN-ELSE, DO, COND, PROCEDURE-CALL, etc. Data-flow links
are deduced by a symbolic interpretation (developed by Rich in
[Rich & Shrobe, 1976]) of primitives such as assignment to variables, nested function
applications, etc. After this translation to plan diagrams, the raw code is not
consulted, although links to it are maintained.

A  recognition  mapping  of  this  surface  plan  consists  of  an  aggregation  of  some  of 
its  original  segments  into  larger  segments,  a  mapping  of  these  to  the  segments  of  a 
library  plan  such  that  all  plan-type  and  spec-type  constraints  of  the  library  plan  are 
satisfied  by  the  surface  plan  segments,  and  such  that  the  data-flow,  control-flow  and 
conditional control flows of the surface plan are consistent with those of the library
plan. If such a recognition mapping can be constructed, it then follows that any
property of the deep plan is also true of the corresponding surface plan.
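One way to picture the data-flow part of this definition is the small Python
check below. It is a deliberately partial sketch: it looks only at data-flow
links and says nothing about spec-type, plan-type, or control-flow constraints.
The representation (segments as strings, flows as pairs, a dictionary for the
mapping) and the name `flows_consistent` are assumptions made for illustration.

```python
def flows_consistent(surface_flows, library_flows, mapping):
    """Check that the mapping carries surface data-flows onto library data-flows.

    surface_flows, library_flows: iterables of (producer, consumer) pairs.
    mapping: dict from surface segment to library segment; several surface
    segments may map to one library segment (aggregation).
    """
    image = set()
    for source, target in surface_flows:
        if source in mapping and target in mapping:
            a, b = mapping[source], mapping[target]
            if a != b:          # a flow inside one aggregated segment is internal
                image.add((a, b))
    return image == set(library_flows)
```

For example, mapping surface segments `s1` and `s2` onto one library segment
`A` and `s3` onto `B` is consistent with a library plan whose only flow is
from `A` to `B`, provided the surface plan has a flow from `s2` to `s3`.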

Actually, the above conditions are slightly too strong. In constructing a
recognition mapping it is allowable to ignore some of the inputs or outputs of surface
plan segments. Data flows connected to such inputs and outputs must also be ignored
as must those spec clauses which mention such objects. Within the surface plan, sub-segments
which are connected to ignored inputs must also be ignored. This allows us
to separate a segment's behavior into those parts which involve a set of objects under
consideration and those which do not.

When  we  say  that  a  surface  segment  satisfies  a  spec-type  of  the  library  plan  we 
mean  that  we  can  select  some  of  the  input  and  output  objects  of  the  surface  plan 
such  that  the  I/O  behavior  of  the  library  plan  segment  can  be  shown  to  hold  for  the 
aspect  of  the  surface  segment's  behavior  which  involves  only  the  objects  selected.  A 
similar  principle  applies  to  saying  that  a  surface  plan  segment  satisfies  the  plan  type  of 
its  corresponding  library  plan  segment. 

Notice  that  a  recognition  mapping  is  not  required  to  map  every  surface  plan 
segment into a library plan segment. However, if a set of recognition mappings has
been  constructed  such  that  each  surface  plan  segment  is  in  the  domain  of  at  least  one 
completely  constructed  recognition  mapping,  then  we  say  that  the  surface  plan  has 
been  completely  recognized. 
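The coverage condition in this definition is simple enough to state as code.
The following Python sketch assumes each completed recognition mapping is a
dictionary whose keys are the surface plan segments in its domain; the function
name and the segment names in the usage note are hypothetical.

```python
def completely_recognized(surface_segments, mappings):
    """True if every surface segment lies in the domain of some mapping."""
    covered = set()
    for mapping in mappings:
        covered.update(mapping)     # updating with a dict adds its keys
    return set(surface_segments) <= covered
```

With segments `["test", "cons", "left"]`, a traversal mapping covering `test`
and `left` together with an accumulation mapping covering `cons` gives complete
recognition, while either mapping alone does not.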

I will now proceed to show how the original FRINGE program can be completely
recognized as a composition of TREE-TRAVERSAL, LEAF-FILTERING, and SEQUENTIAL-ACCUMULATION.
Following a sketch of this recognition process I will indicate how to extend the
notions of recognition so as to gain greater abstraction power.

The construction of the recognition mapping depends on an inductive argument.
We wish to show that by ignoring the ACCUMULATION input, we may regard FRINGE as an
instance of the TREE-TRAVERSAL plan. Ignoring the ACCUMULATION input forces us to ignore
the ACCUM-OP segment, as well as the ACCUMULATION input to the two recursive calls. We
can then construct the straightforward recognition mapping of TEST-LEAF in FRINGE to
TEST-LEAF in TREE-TRAVERSAL, LEFT to LEFT, RIGHT to RIGHT, and LEFT-FRINGE (RIGHT-FRINGE) to
LEFT-TREE-TRAVERSAL (RIGHT-TREE-TRAVERSAL). Each of these segments except the two FRINGE
segments satisfies its spec-type constraints trivially. For the two FRINGE segments the
induction hypothesis is invoked, giving the desired result.


[Figure: Recognition of Fringe as a Tree Traversal]


This partial recognition tells us that the temporal collection generated by the
occurrence set of type TEST-LEAF within FRINGE includes exactly the nodes of the input tree.
We now have to construct a partial recognition for the SEQUENTIAL-ACCUMULATION plan.

The recognition mapping identifies the occurrences of the CONS segment in FRINGE
with the ACCUM-OP of SEQUENTIAL-ACCUMULATION; the data flows from TEST-LEAF to the CONS are
mapped onto the temporal collection data-flow to the ACCUM-OP. The remaining data-flows
of the library plan for SEQUENTIAL-ACCUMULATION then require us to show that the
first CONS in FRINGE receives an empty list as its input, that the output of each
occurrence of CONS flows to the next occurrence, and that the output of the last
occurrence of CONS flows to the output of the whole segment.

The proof of these claims is a straightforward (plan-type) induction. We assume
that the occurrence set of type CONS within each internal application of type FRINGE is in
fact a SEQUENTIAL-ACCUMULATION. There are two cases to consider. If the tree input to
FRINGE is a terminal, then there is exactly one occurrence of CONS; its inputs are the
terminal node itself and the CURRENT-ACCUMULATION input of FRINGE; its output is the output
of FRINGE. Thus, in this case all the requirements are met and we may regard the
occurrence set of type CONS as a SEQUENTIAL-ACCUMULATION.


[Figure: Recognition of Fringe as Sequential Accumulation]


In the second case, the tree input to FRINGE is non-terminal and there are two
recursive occurrences of type FRINGE. We can assume the induction hypothesis for both
the LEFT-FRINGE and RIGHT-FRINGE occurrences of FRINGE. We now construct a case
analysis. By the induction hypothesis the occurrence set of type CONS within each
internal application of type FRINGE is a SEQUENTIAL-ACCUMULATION. This leaves four special
occurrences of CONS, namely the first and last occurrences within LEFT-FRINGE and
RIGHT-FRINGE. All other occurrences of CONS cascade their outputs to the next occurrence and
receive one of their inputs from the previous occurrence.



The special cases are handled easily (see diagram below). The data-flow from
RIGHT-FRINGE to LEFT-FRINGE implies that all occurrences of CONS in RIGHT-FRINGE precede all
occurrences of CONS in LEFT-FRINGE. Thus, CONS-0, the first occurrence of CONS within
RIGHT-FRINGE, is the first CONS within FRINGE; similarly CONS-3, the last CONS of LEFT-FRINGE, is
the last CONS of FRINGE. But the data-flow links of the FRINGE plan state that C-ACCUM-1,
the CURRENT-ACCUMULATION input to FRINGE, flows to the CURRENT-ACCUMULATION input to
RIGHT-FRINGE. By the induction hypothesis RIGHT-FRINGE is an instance of SEQUENTIAL-ACCUMULATION,
thus C-ACCUM-1 flows to the first CONS of RIGHT-FRINGE, i.e. to the first CONS of FRINGE.
Thus, the CURRENT-ACCUMULATION input to FRINGE flows to the first occurrence of CONS within
FRINGE.

Similarly the data-flow links of the FRINGE plan state that the output of FRINGE is
ANSWER-1, the output of LEFT-FRINGE. However, since by the induction hypothesis
LEFT-FRINGE is an instance of SEQUENTIAL-ACCUMULATION, its output, ANSWER-1, is the output of
CONS-3, the last occurrence of CONS within both LEFT-FRINGE and FRINGE itself. Thus,
the THE-ANSWER output of FRINGE is the output of the last occurrence of CONS within FRINGE.

The two remaining special cases are CONS-1, the last CONS of RIGHT-FRINGE, and CONS-2,
the first CONS of LEFT-FRINGE. The data-flow links of FRINGE state that ANSWER-2, the
THE-ANSWER output of RIGHT-FRINGE, flows to the CURRENT-ACCUMULATION input of LEFT-FRINGE.
Again, by the induction hypothesis, since ANSWER-2 is the output of RIGHT-FRINGE, it is also
the output of CONS-1, the last CONS of RIGHT-FRINGE. Similarly, since it is the input of
LEFT-FRINGE it is also the input of CONS-2, the first CONS of LEFT-FRINGE. Thus, the output
of CONS-1 flows directly to the input of CONS-2, satisfying the induction requirements.
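The chaining of conses argued above can be observed directly in a small Python
sketch. The program below is a reconstruction from the prose of this chapter
(the accumulation passes first through the right subtree, then through the
left), not the report's own listing; the tuple encoding of trees and the `trace`
argument, which records each cons as (node, input list, output list), are
assumptions added for illustration.

```python
def fringe(tree, current_accumulation, trace):
    """Accumulate the leaves of the tree onto current_accumulation."""
    if not isinstance(tree, tuple):                 # terminal node
        result = [tree] + current_accumulation      # the single cons of this call
        trace.append((tree, current_accumulation, result))
        return result
    left, right = tree
    # The incoming accumulation flows to the right recursive call ...
    answer_right = fringe(right, current_accumulation, trace)
    # ... whose answer becomes the accumulation input of the left call,
    # and the left call's answer is the answer of the whole program.
    return fringe(left, answer_right, trace)
```

Inspecting `trace` after a run shows the three properties used in the proof:
the first cons receives the initial accumulation, each cons feeds the next, and
the last cons produces the final answer.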




To complete the recognition of FRINGE as a TREE-TRAVERSAL composed with a
SEQUENTIAL-ACCUMULATION, the final thing we need to show is that the temporal collection
output of TREE-TRAVERSAL flows to the temporal collection input of SEQUENTIAL-ACCUMULATION.
In terms of the recognition mapping which has been constructed, this means that the
temporal collection generated by the occurrence set of type TEST-LEAF should flow to the
occurrence set of type CONS. Again this is a direct result of a simple inductive
argument. If the input tree is a terminal then the result is trivial: there is exactly one
data-flow between TEST-LEAF, the one occurrence of that type, and the one occurrence
of type CONS. In the non-terminal case, the induction hypothesis says that there is a
temporal collection data-flow between the occurrence set of type TEST-LEAF and the
occurrence set of type CONS in both LEFT-FRINGE and RIGHT-FRINGE. But if the input tree is
non-terminal, these are the only occurrences of type CONS within FRINGE. It follows that
the temporal collection input to the occurrence set of type CONS within FRINGE is exactly
the temporal collection generated by the occurrence set of type TEST-LEAF within FRINGE.



Thus, we have succeeded in mapping the FRINGE program onto the plan formed by
a composition of TREE-TRAVERSAL with SEQUENTIAL-ACCUMULATION. As noted earlier, an
immediate consequence of this result is the fact that the computed accumulation contains
exactly the terminal nodes of the tree.

It might seem at first glance that we have developed an extremely cumbersome
technique where simpler methods might suffice. Two factors militate against this.
First, much of the work we have shown here is illustrative only and would not be
required in the actual routine recognition of programs like FRINGE. The cascade of two
SEQUENTIAL-ACCUMULATIONs, for example, is so commonly used that it would be represented
in the plan library as an instance of SEQUENTIAL-ACCUMULATION. Similarly, the plan library
could include a plan called TREE-TRAVERSAL-&-ACTION representing the general class of
programs which, like FRINGE, traverse a tree and act upon the nodes produced.
Thus, virtually all the work I have shown can be pre-proven and filed away. The
actual recognition would be quite simple and involve little work. Second, it should be
remembered that properties of TREE-TRAVERSAL and similar cliches are pre-proven. The
apprentice separates the work of analyzing plans from that of analyzing programs,
reducing the program understanding task to the relatively simple task of analysis by
inspection.


Section 9.1: Abstract Flows, Data and Control Pathways

The  strength  of  this  method  depends  on  the  ability  to  abstract  out  details  of  the 
code  in  order  to  view  the  program  as  an  instance  of  a  more  abstract  but  better 
understood  artifact.  Abstraction  techniques  allow  the  plans  to  achieve  greater 
generality.  So  far  we  have  seen  techniques  for  abstracting  procedural  behavior  (specs), 
various  issues  of  data-flow  (temporal  collections)  and  control-flow  (temporal  control 
sequences). I will now add a further abstraction to the repertoire used in the plan
diagram notation. This new feature, called control- and data-pathways, will allow the
apprentice to recognize programs as instances of plans to which they bear little
immediate resemblance. This will allow us to represent commonalities at a more
abstract level.



Consider the following program which computes the fringe of a tree using a
breadth first traversal.

    (defun Qfringe (tree)
      (prog (Acc Node Q)
            (setq Q (Enqueue tree Q))            ; start with the root in the queue
         lp (cond ((Queue-Empty? Q) (return Acc))
                  (t (setq Node (Dequeue Q))     ; Dequeue is assumed to update Q
                     (cond ((Leaf? Node) (setq Acc (cons Node Acc)))
                           (t (Enqueue (left Node) Q)
                              (Enqueue (right Node) Q)))
                     (go lp)))))                 ; loop after either branch

A close examination of the temporal behavior of this program reveals that it is quite
similar to the FRINGE program of the previous section. In both programs there is a
traversal of the nodes of the tree, a filtering of these nodes to select the leaf nodes,
and an accumulation of the leaves. This accumulation is the output of both processes.
In fact, the only significant difference between the two programs is the order of
traversal of the nodes of the tree. However, since the library plan for TREE-TRAVERSAL
makes no commitment to the order of traversal, it ought to be possible to recognize
QFRINGE as an instance of the "fringe plan".
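For readers who want to run the breadth-first version, here is a Python
rendering of the queue-based program. It is a sketch under assumptions: trees
are encoded as nested `(left, right)` tuples with any non-tuple value as a
leaf, `collections.deque` stands in for the queue operations, and the name
`qfringe` is chosen here for illustration.

```python
from collections import deque


def qfringe(tree):
    """Collect the leaves of a tree using a breadth-first traversal."""
    accumulation = []
    queue = deque([tree])                   # enqueue the root
    while queue:                            # loop until the queue is empty
        node = queue.popleft()              # dequeue the next node
        if not isinstance(node, tuple):     # leaf: accumulate it
            accumulation = [node] + accumulation
        else:                               # non-terminal: enqueue both children
            left, right = node
            queue.append(left)
            queue.append(right)
    return accumulation
```

The leaves come out in a different order than a depth-first accumulation would
give, but the set of accumulated nodes is the same, which is all the fringe
plan commits to.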

On the other hand, there are obvious superficial differences. FRINGE is doubly
recursive in its surface syntax, whereas QFRINGE is a loop (singly recursive); QFRINGE has
an explicit queue while FRINGE has no similar explicit data-structure. Given the
superficial clues present in QFRINGE, any reasonable recognition process would first guess
that QFRINGE is an instance of the QUEUE-AND-PROCESS plan shown below.


[Figure: Queue and Process Plan]


Consider the plan diagram for QUEUE-AND-PROCESS. The property of this diagram
which seems most useful is that for any enqueued object there is an occurrence of
DEQUEUE in which this object is dequeued. Furthermore, every object dequeued is input
to an occurrence of ACTION. In summary, there is a sequence of occurrences taking
each enqueued object to an occurrence of ACTION. Such a sequence of events is merely
an abstract data-flow; when we draw data-flow links in a library plan, what is of
concern is that the object gets from one segment to the other. The actual method of
transmission is of little concern as long as we can believe that no significant property
of the transmitted object will be lost in the process.

We will call such an abstract data-flow a data-pathway. A data-pathway may be
formally defined as follows:

There is a data-pathway of OBJ-1 from application APP-1 to application APP-2 if there
is a set of applications A and a set of data-flow links L such that:

(i) Each data-flow link of L connects two applications of A.

(ii) Each application of A either initiates or terminates a link of L.

(iii) Under the temporal ordering imposed by the links of L, A has both a glb
and a lub.

(iv) OBJ-1 is an output object of APP-1 which flows to an input of the glb
of A.

(v) OBJ-1 is an input object of APP-2 which flows from an output of the lub
of A.

These conditions intuitively state that there is a set of causally connected events in
which OBJ-1 flows out of the first one and into the last. Typically applications will be
connected by data-flows of an object which contains OBJ-1 as a sub-structure. The
queue in the above program plays this role.
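A simplified reading of this definition treats a data-pathway as reachability
through a chain of data-flow links. The Python sketch below takes that reading;
it is not a full implementation of conditions (i) through (v), since it ignores
the glb and lub requirement and does not track which object travels along each
link. The function name and the link representation are assumptions.

```python
def has_data_pathway(links, app_1, app_2):
    """Return True if some chain of data-flow links leads from app_1 to app_2.

    links: iterable of (producer, consumer) pairs of application names.
    """
    successors = {}
    for source, target in links:
        successors.setdefault(source, []).append(target)
    seen, frontier = {app_1}, [app_1]
    while frontier:
        current = frontier.pop()
        for nxt in successors.get(current, []):
            if nxt == app_2:
                return True
            if nxt not in seen:             # explore each application once
                seen.add(nxt)
                frontier.append(nxt)
    return False
```

In the queue program, an enqueue feeding the loop, the loop feeding a dequeue,
and the dequeue feeding the node-processing step would form such a chain.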


[Figure: Breadth First Fringe Program Using a Queue and Process Plan]


It is easy to prove by computational induction that for any object which is a
member of the input queue of an occurrence of QUEUE-AND-PROCESS-1 there will be a
data-pathway to an occurrence of PROCESS-MEMBER. It is also easy to show that QFRINGE is an
instance of QUEUE-AND-PROCESS, mapping QFRINGE-1 onto QUEUE-AND-PROCESS-1 and PROCESS-NODE onto
PROCESS-MEMBER. Thus, for each member of the queue in an application of QFRINGE-1 there
will be a data-pathway via a sequence of enqueues and dequeues to an occurrence of
PROCESS-NODE.

If we regard these data-pathways as data-flows, we may then construct a
recognition mapping between QFRINGE and FRINGE. If TREE-1, the input node, is
non-terminal then the TEST-LEAF segment in QFRINGE will bring control to PROCESS-NON-TERMINAL.
LEFT will extract LEFT-1, the left node of TREE-1, and ENQ will make LEFT-1 a member of
the queue. RIGHT and ENQ will act similarly on RIGHT-1, the right node of TREE-1.
Control now passes to QFRINGE-1 (i.e. we return to the beginning of the loop), with
both LEFT-1 and RIGHT-1 members of the queue. By our remarks above, each of these
flows via a data-pathway to another occurrence of PROCESS-NODE. Let us call these
occurrences PN-1 and PN-2 respectively. We can map PN-1 onto LEFT-FRINGE and PN-2 onto
RIGHT-FRINGE, thus completing the recognition of QFRINGE as a FRINGE program.

The  use  of  pathways  becomes  quite  important  in  more  complicated  queue  and 
stack  based  programs  such  as  procedural  deduction  systems  which  rely  on  pattern 
directed  invocation.  In  these  programs,  processes  communicate  by  making  assertions  in 
a data base, possibly triggering other programs into execution. The notion of a
pathway  makes  the  description  of  this  mechanism  far  more  concise  than  would  be 
possible  otherwise.  Furthermore,  it  allows  the  system  to  understand  such  demon-based 
programs  in  terms  of  the  communications  between  processes  rather  than  in  terms  of 
the mechanism of communication. Pathways are analogous to the sometimes notion of
Manna and Waldinger [Manna & Waldinger, 1976] in that they speak of control
reaching a certain point at some future time, rather than immediately; however, I use
pathways  as  an  abstraction  tool  which  transforms  one  plan  diagram  into  a  second, 
more  easily  recognized  diagram. 

Because  pathways  allow  this  greater  flexibility,  all  flow  statements  of  a  library 
plan  may  be  matched  by  pathways  implemented  in  the  surface  plan.  Library  plans  are 
stated  in  terms  of  data  abstractions,  specs,  purpose  links,  data  and  control  pathways. 




Section  9.2:  Summary 

The methodology I have outlined relies on developing a library of pre-proven
plan fragments which capture substantial parts of the knowledge of an expert
programmer. Once such fragments have been catalogued, the effort of program
analysis may be reduced to that of recognition. I have not discussed what heuristics
would guide such a recognition system; interesting work in that direction is being
conducted by Rich [Rich, 1977] and Waters [Waters, 1977]. Instead I have
concentrated on how such a process would interact with the reasoning capabilities
of the program analysis system.

Our method might be challenged on the grounds that it involves as much work
as more standard approaches to program verification. However, this work is factored
in two ways which are highly significant. First, we divide the task into (1) pre-proving
frequently used standard plan fragments whose logical analysis need never be
repeated, and (2) recognizing the occurrences of these fragments within more complex
programs.

Second, the recognition process itself is factored into many discrete steps, such
as: (1) showing that each of the proposed segments of the surface plan satisfies the
type constraints of the library plan, (2) showing that the data-flows of the surface
plan implement the data-pathways of the conceptual plan. While the total amount of
work involved might be substantial it is separated into "bite sized" pieces; furthermore,
the framework allows the reasoning system to self-consciously concentrate on the
particular goal at hand at that moment. While attempting one particular recognition
all other parts of the program can be ignored.




Chapter 10: Description of Data-Structures

So far I have presented methods of describing various components of
programming knowledge; I have also shown the various reasoning techniques used in
REASON to operate on these descriptions. In this chapter I will develop a language
for describing the static properties of data-structures. In the next chapter (which will
discuss side-effects) I will describe REASON's methods for reasoning about how
properties of data-structures change.

Knowledge about data-structures is a key component of the knowledge base
needed by the programmer's apprentice system. This knowledge is used by the
synthesis and recognition systems in a declarative form. In the reasoning part of the
system, descriptions of data-structures are used in a more active or procedural manner.

Data-structures  are  perhaps  the  most  flexible  items  in  the  apprentice's  knowledge 
base;  programmers  routinely  devise  new  data-structures  and  new  methods  of 
implementing  them.  It  is  crucial  that  a  convenient  language  be  developed  for  the 
description  of  data-structures.  This  language  allows  the  programmer  to  tell  the 
apprentice about new data-structures, their decomposition into parts and the
constraints which these parts must satisfy. Finally, it allows the programmer to devise
new  relationships  which  might  be  true  of  the  new  data-structure  and  to  provide  the 
apprentice  with  definitions  of  these  new  relations. 

The data description language has two main features. First, it is syntactically
declarative, allowing the programmer to describe objects without having to know the
rule-based syntax of the deductive system. The declaratives are translated by
REASON into rules which actually make the deductions. Second, the description
language allows hierarchical knowledge structures and inheritance of properties. An
ALIST can be described as a special kind of LIST which in turn is described as a special
kind of RECURSIVE-STRUCTURE. Properties are described at the most general level possible.
Typically the programmer need only state that the structure he is designing is a
special case of some other (or of several other) structure(s). Each aspect of the
inherited behavior of the newly defined structure is then mapped down from the
parent structure(s). Usually, only a few new properties are involved in the definition
of this new object.




Section 10.1: The Data-Description Language

At the lowest level, a data-structure is something which has specific parts subject
to certain constraints. For example, a list is usually described as having a FIRST and a
REST subject to the restriction that the rest must be either another list or a terminator
such as NIL. A BINARY-TREE, similarly, has a LEFT and a RIGHT where both are required to
be either BINARY-TREEs or TERMINALs. These ideas are represented by two types of
assertions, part and type-restriction:

    (Part <object-type> <part-name>)

    (Type-Restriction (<part-name> <object-type>) <object-type>)

For  example, 

    (Part List First)

    (Part List Rest)

    (Type-Restriction (Rest List) List)

Object-types are sets of similarly structured objects such as LIST or BINARY-TREE and
are subject to the normal set operations of union, intersection, set difference, etc. In
particular, we use the subset relation extensively to build a hierarchy of knowledge.
For example, lists are LINEAR-OBJECTs, and PROPERTY-LISTs are LISTs. This is stated as
follows:


    (Subset List Linear-Object)

    (Subset Property-List List)

Two useful pieces of information are extracted from such statements by
REASON's rules. First, anything which is a subset of an object-type is itself an
object-type; second, anything which is a member of a subset of an object-type is a
member of the larger type as well.




    (rule
      ((a (Subset subtype supertype)))
      (assert '(object-type subtype)
              '(Subset-implies-type a)))

    (rule ((a (Subset subtype supertype))
           (b (type object subtype)))
      (assert '(type object supertype)
              '(type-chain a b)))
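The effect of these two rules can be imitated with a small forward-chaining
loop. The Python sketch below is only an analogy to the rule system: it keeps
no justifications or dependency records, and the function name `close_types`
and the tuple representation of facts are assumptions made for illustration.

```python
def close_types(subsets, memberships):
    """Apply the two subset rules until no new facts appear.

    subsets: set of (subtype, supertype) pairs.
    memberships: set of (object, type) pairs.
    Returns (object_types, facts).
    """
    # First rule: anything declared as a subset of an object-type is
    # itself an object-type.
    object_types = {subtype for subtype, _ in subsets}
    # Second rule: a member of the subtype is a member of the supertype.
    facts = set(memberships)
    changed = True
    while changed:
        changed = False
        for obj, typ in list(facts):
            for subtype, supertype in subsets:
                if typ == subtype and (obj, supertype) not in facts:
                    facts.add((obj, supertype))
                    changed = True
    return object_types, facts
```

Starting from one object known to be a property list, repeated application
derives that it is also a list and a linear object.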

Object-types are often combined to yield new types. For example, we might
want to have an object-type which includes both EMPTY- and NON-EMPTY-LISTs. This
object-type might be called LIST. However, it is also useful to distinguish between
EMPTY- and NON-EMPTY-LISTs; therefore, we also have the two specialized object-types called
EMPTY-LIST and NON-EMPTY-LIST. Obviously, LIST is the union of the other two
object-types; in addition the intersection of EMPTY-LIST and NON-EMPTY-LIST is the null set. Such
a situation is so common in defining object-types that I have distinguished it with the
special name partition. We write this as follows:

    (Partition List into (Empty-List Non-Empty-List))

from  which  it  follows  that: 


    (Subset Empty-List List)

    (Subset Non-Empty-List List)

    (Union (Empty-List Non-Empty-List) List)

    (Intersection (Empty-List Non-Empty-List) Null-Set)
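The expansion of a partition statement into its consequences is mechanical, as
the following Python sketch suggests. The tuple encoding of assertions and the
function name `expand_partition` are hypothetical; the report does not say how
REASON performs this translation internally.

```python
def expand_partition(whole, parts):
    """Derive the subset, union and intersection facts implied by a partition."""
    facts = [("Subset", part, whole) for part in parts]
    facts.append(("Union", tuple(parts), whole))
    facts.append(("Intersection", tuple(parts), "Null-Set"))
    return facts
```

Calling `expand_partition("List", ["Empty-List", "Non-Empty-List"])` yields the
four assertions listed above, in the same order.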

It is usually insufficient to know only that an object-type is partitioned; we also need
to know how to distinguish between the sub-types. The partition of LIST into EMPTY
and NON-EMPTY exhibits a very frequent and common method of distinction, namely that
one of the sub-types has a more detailed part structure. Sometimes, however, a more
involved criterion is used to distinguish sub-types. For example, we may partition
lists into CYCLIC-LIST and ACYCLIC-LIST
using a much more complicated criterion. Two syntactic extensions to the partition
statement are provided to facilitate the stating of these restrictions.




The first of these is the allows clause, written as follows:


    (Partition List
        into (Empty-List Non-Empty-List)
        (Allows Non-Empty-List (First Rest)))

The allows clause says that the named sub-type (NON-EMPTY-LIST, in this case) includes
the part names mentioned in its set of part names; furthermore, this is a distinguishing
characteristic within the given partition. In the example above this means that
NON-EMPTY-LISTs can be distinguished from EMPTY-LISTs by the presence of either a FIRST part
or a REST part.
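One possible reading of the allows clause as a classification procedure is
sketched below in Python. It assumes a two-way partition in which exactly one
sub-type carries an allows clause, so that an object lacking all of the allowed
parts falls into the remaining sub-type; the function name and argument shapes
are illustrative assumptions rather than the report's mechanism.

```python
def classify_by_parts(present_parts, subtypes, allows):
    """Pick the sub-type of an object from the part names it exhibits.

    present_parts: set of part names the object is known to have.
    subtypes: list of sub-type names in the partition.
    allows: dict from sub-type name to the set of part names it allows.
    """
    for subtype, part_names in allows.items():
        if present_parts & part_names:      # presence of any allowed part decides
            return subtype
    others = [s for s in subtypes if s not in allows]
    return others[0] if len(others) == 1 else None
```

An object with a `First` part is classified as `Non-Empty-List`; an object with
no parts at all is classified as `Empty-List`.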

We can now give a simple description of LISTs:

    (Partition List
        into (Empty-List Non-Empty-List)
        (Allows Non-Empty-List (First Rest)))

    (Part List First)

    (Part List Rest)

    (Type-Restriction (Rest List) List)

Similarly we can give a description of BINARY-TREEs as follows:

    (Partition Binary-Tree
        into (Terminal Non-Terminal)
        (Allows Non-Terminal (Left Right)))

    (Part Binary-Tree Left)

    (Part Binary-Tree Right)

    (Type-Restriction (Left Binary-Tree) Binary-Tree)

    (Type-Restriction (Right Binary-Tree) Binary-Tree)

If the criterion which distinguishes between sub-types is more complex, we
express this with a DIVIDING-CRITERION clause. For example, we can tell ACYCLIC-LISTs
from CYCLIC-LISTs by the absence of any sub-list which is a sub-list of itself. This is
stated as follows:

    (Partition List
      into (Acyclic-List Cyclic-List)
      (Dividing-Criterion
        (For-all (:sub) (Sub-List Acyclic-List :sub)
          (Not (Proper-Sub-List :sub :sub)))))


The data description language uses several notational conventions which are
shown in this example. The DIVIDING-CRITERION clause above mentions ACYCLIC-LIST.
This use of a type name within a statement of the data-description language is an
implicit quantification over any object of that type; I will explain this notation
further in the next section. In the case of the DIVIDING-CRITERION clause, however, the
meaning is quite simple. Any object of the partitioned type (say LIST) which satisfies
the DIVIDING-CRITERION (the FOR-ALL statement) can be deduced to be of the sub-type
mentioned in the CRITERION (ACYCLIC-LIST).

Now let us move on to other definitional statements. Closely related to the
notion of PART is that of INDEXED-PART, which is used to describe objects like arrays
having many similar sub-structures distinguished by numerical indexing rather than by
name. The index is permitted to be any tuple of integers, but in practice I will
almost always be talking about single-dimensioned INDEXED-STRUCTUREs. INDEXED-PARTs, like
PARTs, are subject to type restrictions. This is written as follows:

    (Indexed-Part <object-type> <indexed-part-name>)

    (Type-Restriction (<indexed-part-name> <object-type>) <object-type>)

Thus an array of integers would be described as follows:


    (Object-Type Integer-Array)

    (Indexed-Part Integer-Array Item)

    (Type-Restriction (Item Integer-Array) Integer)

Some restrictions, however, are of a type which is difficult or awkward to describe as
type restrictions. For example, the object used to index an INDEXED-STRUCTURE is required
not only to be an integer tuple of the appropriate "arity" (a type restriction), it is
also required to be within the correct bounds. Describing this as a type restriction
would require the creation of an object-type for every range of integers. Instead
these more complex restrictions may be stated directly with a REQUIRE statement:


    (Require
      (Item Integer-Array Index Object)
      (Index Integer-Array Index))


which states that if an object is used as the selecting position of an INDEXED-PART
statement, then it must be a valid index of the object whose component it selects.
The notion of INDEX must, of course, be defined elsewhere (see the later section on relation
definitions). REQUIRE statements are invariants which state that any time their first
clause is true, their second clause must be as well. Thus, there are two ways to make
use of the information encoded in such a statement. The first is to deduce the
consequent from the antecedent; the second is to require that the consequent hold any
time the antecedent is realized through a side-effect.
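
The two uses of a REQUIRE statement can be contrasted in a small Python sketch.
This is not REASON's mechanism: the names BoundsError, index_ok, deduce_index and
store_item are hypothetical, facts are plain tuples, and zero-based indexing is
assumed purely for illustration.

```python
class BoundsError(Exception):
    """Raised when a side-effect would violate the required consequent."""


def index_ok(array, index):
    """Stand-in for the INDEX relation: index is a valid position of array."""
    return isinstance(index, int) and 0 <= index < len(array)


def deduce_index(item_facts):
    """Use 1: from each believed (item array index obj) fact,
    deduce the consequent (index array index)."""
    return {("index", name, i) for (_tag, name, i, _obj) in item_facts}


def store_item(array, index, obj):
    """Use 2: a side-effect that would realize the antecedent is
    permitted only if the consequent holds."""
    if not index_ok(array, index):
        raise BoundsError(index)
    array[index] = obj
```

In the first use the invariant adds knowledge; in the second it acts as a
precondition on the operation that would make the antecedent true.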

TYPE-RESTRICTIONs may be regarded as a special case of REQUIRE statements in which
the antecedent is a PART or INDEXED-PART statement and in which the consequent is an
OBJECT-TYPE statement. In fact, both types of statements are translated into REASON
rules in a rather straightforward manner which makes this clear. For example, the
above REQUIRE statement leads to the following:

    (Rule ((:a (Item :s :index :obj))
           (:b (Type :s Indexed-Structure)))
      (Assert '(Index :s :index) '(Require :a :b c)))

where c is the fact-name of the REQUIRE statement. Similarly, a TYPE-RESTRICTION would
be translated into the following:

    (Rule ((:a (Left :bt :node))
           (:b (Type :bt Binary-Tree)))
      (Assert '(Type :node Binary-Tree) '(Type-Rest :a :b c)))

where c is the fact-name of the TYPE-RESTRICTION statement.

Given these means for describing the component structure of an object-type we
may now go on to define properties and relations of data-objects. Consider a
BINARY-TREE; we would like to describe what it means to be a NODE of such a tree. We do this
with a simple recursive definition: a NODE of a NON-TERMINAL BINARY-TREE is either the TREE
itself or a NODE of the LEFT of the TREE or a NODE of the RIGHT of the TREE. The only NODE
of a TERMINAL TREE is the TREE itself. This is represented as follows:


    F-1 (Relation-Definition (node non-terminal binary-tree)
          <=> (or (id non-terminal binary-tree)
                  (node [left non-terminal] binary-tree)
                  (node [right non-terminal] binary-tree)))

    F-2 (Relation-Definition (node terminal binary-tree)
          <=> (id terminal binary-tree))

Notice that assertions of the data-description language do not use REASON variables
(e.g. :nt) but only simple identifiers (e.g. non-terminal). This is a notational
convenience which is possible since REASON translates these statements into REASON
rules which do use variables.

There is a second notational shorthand involved here. Each IDENTIFIER (i.e.
everything other than the logical operators and the predicate names) used in a
data-description statement is an OBJECT-TYPE (e.g. BINARY-TREE, NON-TERMINAL, etc.). This allows
relations to be polymorphic, i.e. one relation (such as MEMBER) may apply to many
different pairs of object-types. MEMBER, for example, is a relation which holds between
LISTs and OBJECTs, HASH-TABLEs and ENTRIES, ALISTs and DOTTED-PAIRs, etc. The use of
object-types as the IDENTIFIERs of the RELATION-DEFINITION restricts the applicability of the
definition to exactly those cases where the objects involved satisfy the type constraints
implied by the IDENTIFIERs. (IDENTIFIERs may actually be subscripted object-types such as
NON-TERMINAL-1 in those cases where there is more than one object of a type mentioned
within a single relation.) It is important to remember that this notational principle
applies only to statements of the data description language, not to REASON's rules.
When data-description assertions are translated into the rules of the reasoning system,
the requirements implicitly represented by the IDENTIFIERs of the data description
statement are made explicit as triggers of the rule.
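
The effect of using object-types as identifiers can be illustrated with the NODE
definitions. The Python sketch below is an assumption-laden analogue, not REASON's
translation: the class BinaryTree and the functions is_non_terminal and node are
hypothetical, the ID relation is modelled as object identity, and the type
constraint that REASON would turn into a rule trigger appears here as an ordinary
conditional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BinaryTree:
    label: str
    left: Optional["BinaryTree"] = None
    right: Optional["BinaryTree"] = None


def is_non_terminal(tree):
    # Only NON-TERMINAL trees are allowed LEFT and RIGHT parts.
    return tree.left is not None or tree.right is not None


def node(tree, candidate):
    """True if candidate is a NODE of tree."""
    if is_non_terminal(tree):
        # The non-terminal definition applies only when the type
        # constraint on its first identifier is satisfied.
        return (candidate is tree
                or (tree.left is not None and node(tree.left, candidate))
                or (tree.right is not None and node(tree.right, candidate)))
    # Terminal definition: the only NODE of a terminal tree is itself.
    return candidate is tree
```

Each branch corresponds to one definition; which one is applicable is decided
solely by the type of the tree argument.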

RELATION-DEFINITIONs are used in a number of ways in REASON, corresponding to
the various ways in which implications can be used in logic systems. The simplest of
these is the substitution of a right hand side for the left. REASON translates the
above definition into the following rule, which will make this substitution if requested
to do so:


    (Rule
      ((:e (Expand ((node :nt :bt) :s))))
      (Rule
        ((:a ((node :nt :bt) :s))
         (:b ((non-terminal :nt) :s))
         (:c ((binary-tree :bt) :s)))
        (Assert '(Or (id :nt :bt)
                     ((node [left :nt] :bt) :s)
                     ((node [right :nt] :bt) :s))
                '(rel-def :a :b :c F-1))))

In this rule there are two levels of invocation. The outer level is triggered by an
explicit request to expand the assertion. This creates the inner rule, which checks to
see if, in fact, the assertion to be expanded is believed and then checks to see if the
TYPE-CONSTRAINTs are satisfied. If so, the definition is asserted and justified by the
triggering facts. However, if the TYPE-CONSTRAINTs are not satisfied then this rule
represents an inappropriate definition and no assertion is made.

A second use of RELATION-DEFINITIONs is in the antecedent deduction of a relation
such as MEMBER from the facts corresponding to the right hand side of the definition.
For example, the following is the standard recursive definition for membership in a
list.

    F-3 (Relation-Definition (Member List Object)
          <=> (or (First List Object) (Member [Rest List] Object)))

This definition has the following antecedent use: if we know that an object is the
FIRST of a LIST, or if we know that it is a MEMBER of the REST of the LIST, then we can
infer that it is a MEMBER of the LIST. This corresponds to two rules which REASON
creates from this definition:


    (Rule
      ((:a ((first :l :o) :s))
       (:b ((type :l list) :s)))
      (Assert '((member :l :o) :s) '(rel-def :a :b F-3)))

    (Rule
      ((:a ((rest :l :r) :s))
       (:b ((member :r :o) :s))
       (:c ((type :l list) :s)))
      (Assert '((member :l :o) :s) '(rel-def :a :b :c F-3)))

Notice that in the second rule REASON translated the reference expression [Rest List]
into a pattern in the trigger set of the rule. All reference expressions within the
clauses of a RELATION-DEFINITION are handled in this way; recursively nested expressions
are brought out to the top level. MEMBER in a HASH-TABLE is such a relation. A HASH-TABLE
is an INDEXED-STRUCTURE in which each component (called a BUCKET) is a set, represented by
some data-structure. There is a functional relationship called HASH which maps ENTRIES
into INDICES of the table. An ENTRY is a MEMBER of the TABLE if it is a MEMBER of the BUCKET
into which it HASHes.

    F-4 (Relation-Definition (Member Hash-Table Entry)
          <=> (Member [Bucket Hash-Table [Hash Entry]] Entry))

This  definition  is  translated  into  the  following  rule. 


    (Rule
      ((:a1 ((member :b :e) :s))
       (:a2 ((type :e entry) :s))
       (:a3 ((type :b bucket) :s))
       (:a4 ((hash :e :index) :s))
       (:a5 ((bucket :t :index :b) :s))
       (:a6 ((type :t hashtable) :s)))
      (Assert '((member :t :e) :s)
              '(rel-def :a1 :a2 :a3 :a4 :a5 :a6 F-4)))

Such rules are built by first replacing the reference expressions by equivalent
existentially quantified statements and then transforming the statement into clausal form.
Each disjunct of the resulting expression is then translated directly into a rule with
the set of conjuncts forming the trigger set.
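
The antecedent use of the list-membership definition can be imitated by a very small
forward chainer. The Python sketch below is illustrative only: forward_chain and the
tuple encoding of facts are assumptions of this example, situations are omitted, and
the justification tuples merely mimic the shape of REASON's dependency records.

```python
def forward_chain(facts):
    """facts maps fact tuples to justifications; returns an extended copy."""
    facts = dict(facts)
    changed = True
    while changed:
        changed = False
        lists = {f[1] for f in facts if f[0] == "type" and f[2] == "list"}
        new = {}
        for f in list(facts):
            # Rule 1: (first l o), (type l list) => (member l o)
            if f[0] == "first" and f[1] in lists:
                _, l, o = f
                new[("member", l, o)] = (
                    "rel-def", f, ("type", l, "list"), "F-3")
            # Rule 2: (rest l r), (member r o), (type l list) => (member l o)
            if f[0] == "rest" and f[1] in lists:
                _, l, r = f
                for g in list(facts):
                    if g[0] == "member" and g[1] == r:
                        new[("member", l, g[2])] = (
                            "rel-def", f, g, ("type", l, "list"), "F-3")
        for fact, why in new.items():
            if fact not in facts:
                facts[fact] = why  # record the supporting facts
                changed = True
    return facts
```

Each derived MEMBER fact carries the trigger facts that produced it, so a
conclusion can later be traced back to, or withdrawn along with, its support.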


We should note that RELATIONs, like PARTs, are subject to TYPE-RESTRICTIONs. These
restrictions are represented with the same notation as TYPE-RESTRICTIONs on PARTs and
INDEXED-PARTs. For example:

    (Type-Restriction (node binary-tree) binary-tree)

which  would  be  translated  into  a  rule  as  shown  earlier. 

It is a rather common practice to build up new data-types by imposing
restrictions on already existing types. A common example is that of a LISP
association list (A-LIST for short), which is a list all of whose members are PAIRs. We
indicate this as follows:

    (Defining-Restriction
      (Subset Alist List)
      (Type-Restriction (Member Alist) Pair))

Several distinct types of information are extracted from such a statement. First, there
is the obvious SUBSET relation between ALIST and LIST, and the fact that the
TYPE-RESTRICTION applies to the ALIST data-type. A simple rule adds these new facts:


    (Rule
      ((:a (Defining-Restriction :subset-fact :type-res-fact)))
      (Assert :subset-fact '(def-rest :a))
      (Assert :type-res-fact '(def-rest :a)))

However, there is a further piece of information in the DEFINING-RESTRICTION which
is used in a consequent (backward chaining) manner. For example, if one wishes to
show that an object is an ALIST, one possible strategy is to first show that it is a LIST
and then show that it satisfies the defining restriction of having only PAIRs for MEMBERs.
The following rule embodies this strategy.


    (Rule ((:a1 (goal (((object-type :object alist) :s) . :goal) in :context))
           (:a2 (defining-restriction (subset alist list)
                                      (type-restriction (member alist) pair))))
      (propose-method
        '(method :a1 (defining-restriction :a2)) '(def-rest :a1 :a2)
        (Assert
          '(conjunctive-goal ((object-type :object list)
                              ((for-all (:el) ((member :object :el) :s)
                                              ((object-type :el pair) :s))))
                             (all)
                             (((object-type :object alist) :s) . :goal)
                             :context)
          '(defining-restriction :a2))))

where the CONJUNCTIVE-GOAL mechanism is explained in Chapter 4. Notice that if the
system does conclude that an object is an ALIST, this conclusion will not depend on the
control assertions created during the process but only on the OBJECT-TYPE assertion, the
FOR-ALL assertion, and the DEFINING-RESTRICTION statement.
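
The separation between control assertions and the logical support of the conclusion
can be made concrete with a short Python sketch. Everything in it is hypothetical:
show_alist, the tuple encoding, and the assumption that the known MEMBER facts list
all members of the object are choices made for this example, not features of REASON.

```python
def show_alist(obj, facts):
    """Try to show (object-type obj alist) from a set of fact tuples.

    Returns the set of supporting facts on success, or None on failure.
    Goal bookkeeping is local and never enters the returned support.
    """
    is_list = ("object-type", obj, "list")
    if is_list not in facts:
        return None  # first conjunct of the goal fails

    # Assumes the MEMBER facts present are all the members there are.
    members = [f[2] for f in facts if f[0] == "member" and f[1] == obj]
    for el in members:
        if ("object-type", el, "pair") not in facts:
            return None  # the FOR-ALL conjunct fails

    for_all = ("for-all", obj, "member", "pair")
    return {is_list, for_all, ("defining-restriction", "alist")}
```

The returned support mentions only the three kinds of statement named above, so
discarding the goal structure afterwards would not disturb the conclusion.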

Section 10.2: Parameterized Object Descriptions

The statements I have shown so far allow us to state specific features of a data
structure; however, a descriptive method which "chunks" such statements into larger
parcels of knowledge would be desirable. Such chunks of knowledge can then be
organized into a library of programming skills [Rich & Shrobe, 1976], [Barstow, 1977]
in such a way that generalities are conveniently captured and specialized as needed.
The basic unit of description in this system will be a parameterized object description,
which is a collection of statements describing the structure of a data-object. The
parameters in an object description allow it to describe a family of related data-types.
The object types SET-OF-NUMBERS, SET-OF-LISTS, and UNRESTRICTED-SETS are all defined by the
same parameterized description; only the choice of parameters is changed. This is
quite similar to features found in CLU [Liskov et al., 1977] and ALPHARD
[Wulf et al., 1976].


    (Object-type-definition Set
      (parameters Members-type (type-restriction: object-type))
      (partition Set into (non-empty-set empty-set)
        (Allows Non-Empty-set (Member)))
      (Relation (Member set members-type))
      (type-restriction (Member set) Members-type)
      (Relation (Union Set Set Set)
        (definition (Union S-1 S-2 S-3)
          <=> (Equiv (:el) (or (member s-1 :el) (member s-2 :el))
                           (member s-3 :el))))
      )

The EQUIV quantifier above is merely an abbreviation for two universal quantifiers,
one in each direction. REASON, in fact, treats these as abbreviations, translating
them into the two quantifiers when the object-type definition is read in.
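
As a restatement in conventional notation (not taken from the source, and reading
(member s e) as "e is a member of s"), the EQUIV clause of the UNION definition
abbreviates the following pair of universally quantified implications:

```latex
\forall e.\; \bigl(e \in S_1 \lor e \in S_2\bigr) \rightarrow e \in S_3
\qquad\text{and}\qquad
\forall e.\; e \in S_3 \rightarrow \bigl(e \in S_1 \lor e \in S_2\bigr)
```

Together the two directions say exactly that S_3 is the union of S_1 and S_2.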

In this definition, SET takes a single parameter which is required to be an
object-type name. This name is then used at various points in the definition, for example in
the TYPE-RESTRICTION statement. Such parameters act like the arguments to a macro
generator; the type definition for SET may be invoked with a specific parameter, causing
a more specific type to be created in which the parameter is instantiated. The
following illustrates the syntax for such invocation:

    (set (whose (members-type pos-numbers)))

which means the more specific object-type which consists of SETs whose MEMBERS-TYPE
parameter is POS-NUMBERS. Presumably, POS-NUMBERS is a valid object-type and therefore this
new type is as well. Inspection of the description indicates that this new object-type
allows only POS-NUMBERS as MEMBERs, since the parameter is used in the
TYPE-RESTRICTION statement. Similarly,

    (set (whose (members-type neg-numbers)))

would be the object-type of SETs of negative numbers. The convention is made that
unspecified parameters assume a default value of unrestricted, so that the object-type
specified as SET with no parameters is treated as a SET whose MEMBERs may be of any
object-type. Notice that parameters are given with TYPE-RESTRICTIONs; in practice
parameters are usually restricted to be either object-type names or numbers.


To avoid writing out the rather cumbersome type expressions above I have
included an IS-A statement to give a name to such expressions. Thus, we could write:

    (object-type-definition Set-of-Pos-Numbers
      (is-a (set (whose (members-type pos-numbers)))))

Whenever an OBJECT-TYPE expression is used (either in IS-A clauses or elsewhere) an
instantiation process is invoked to add this new object-type into the hierarchy of
object-types. This invocation process consists of the following steps:

0. When the definition is first entered an object-type is created whose name is the
name given in the definition statement and whose parameters are the default values.
This will be referred to as the base-type, e.g. SET.

1. The new type is given a name. If used within an IS-A clause the name is taken
from the clause. Otherwise, a default name is created.

2. The new type is declared to be a subtype of the base-type. DEFINING-RESTRICTIONs
are added to reflect the effect of the specified parameters. For example, in the
SET-OF-POS-NUMBERS object-type, the DEFINING-RESTRICTION is the TYPE-RESTRICTION of
MEMBERS to POS-NUMBERS.

3. Parameter values are substituted for parameter names, and the new object-type
name is substituted for the base-type name. Each statement within the definition is
then processed. Those whose parameters are specified are asserted; those with default
values are not.

4. If there were unspecified parameters, the partially instantiated object-type
definition is added to the catalogue of object-type names. It, in turn, may serve as a
prototype for further instantiation.
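
The instantiation steps can be sketched in Python. This is a simplified analogue
under stated assumptions: TypeCatalogue, define and instantiate are hypothetical
names, parameters are keyword arguments, the statements of the definition are not
modelled, and (unlike step 4) every instantiation is recorded, not only partial ones.

```python
UNRESTRICTED = "unrestricted"


class TypeCatalogue:
    def __init__(self):
        self.types = {}       # name -> {"base": ..., "params": {...}}
        self.subtype_of = {}  # name -> name of its base-type
        self._counter = 0

    def define(self, name, param_names):
        # Step 0: the base-type takes the default for every parameter.
        self.types[name] = {
            "base": None,
            "params": {p: UNRESTRICTED for p in param_names},
        }

    def instantiate(self, base, name=None, **given):
        base_params = self.types[base]["params"]
        unknown = set(given) - set(base_params)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        # Step 1: use the supplied name, else create a default one.
        if name is None:
            self._counter += 1
            name = f"{base}-{self._counter}"
        # Step 2: the new type is a subtype of the base-type.
        self.subtype_of[name] = base
        # Step 3: substitute given values; the rest keep their defaults.
        params = {**base_params, **given}
        # Step 4: record the result so it can act as a prototype in turn.
        self.types[name] = {"base": base, "params": params}
        return name
```

Because an instantiated type is itself stored with its remaining defaults, it can
be instantiated again to fix further parameters.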

It is often convenient to be able to describe a related family of objects, using
techniques similar to those above. For example, we might want a parameterized way
of describing NUMERIC-INTERVALs, using the upper and lower bounds as parameters. Each
instantiation of this description is a particular object (not an object-type, as above),
but each such object differs only in minor regards from other such objects. Such
descriptions are made with a parameterized object description:


    (Object-definition Numeric-Interval
      (Parameters Upper-bound (type-restriction: Integer)
                  Lower-bound (type-restriction: Integer))
      (restriction (Less-than Lower-bound Upper-bound))
      (Is-a (Set (whose (Members-type Integers))))
      (definition (Numeric-Interval (whose (upper-bound :U) (lower-bound :L)))
        = {:el | (And (less-than :el :U) (greater-than :el :L))}))

Like object-type descriptions, object descriptions take parameters subject to certain
restrictions. The IS-A clause allows us to give an object-type name to any object
described by this parameterized description. Finally, the definition clause states an
equality which defines what objects are named by the current description.

Object descriptions are processed as follows:

(0) A new object-type is created using the name in the description as the type-name
(for example, NUMERIC-INTERVAL above).

(1) If there is an IS-A clause, the object-type name is asserted to be a sub-type of the
type named in the clause.

(2) Any invocation of the description which leaves some parameters unspecified
creates a new object-type (for example, numeric intervals whose lower bound is 0).

(3) An invocation which specifies all parameters creates an object which is asserted to
be equal to the object specified in the definition clause. Finally, if the parameter-set
for this object is a more specific set than the one used to create the next less specific
object-type, then the object is asserted to be of that object-type. (For example, the
numeric interval whose upper bound is 5 and whose lower bound is 0 would be
asserted to be of the object-type of numeric intervals whose lower bound is 0 and
whose upper bound is unspecified.)
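
The distinction between a fully specified description (an object) and a partially
specified one (an object-type) can be sketched as follows. The class NumericInterval
and its methods are hypothetical, None stands for an unspecified parameter, and the
bounds are treated as strict because the definition clause uses less-than and
greater-than.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NumericInterval:
    lower: Optional[int] = None  # None means "unspecified"
    upper: Optional[int] = None

    def __post_init__(self):
        # Restriction clause: lower bound must be less than upper bound.
        if self.lower is not None and self.upper is not None:
            if not self.lower < self.upper:
                raise ValueError("lower bound must be less than upper bound")

    def is_object(self):
        """All parameters given: the description names one object."""
        return self.lower is not None and self.upper is not None

    def members(self):
        """Definition clause: integers strictly between the bounds."""
        if not self.is_object():
            raise TypeError("a partial description names a type, not an object")
        return set(range(self.lower + 1, self.upper))

    def is_instance_of(self, other):
        """True if self agrees with every parameter that other fixes."""
        return all(getattr(other, p) in (None, getattr(self, p))
                   for p in ("lower", "upper"))
```

With these assumptions the interval with bounds 0 and 5 is an object belonging to
the type of intervals whose lower bound is 0 and whose upper bound is unspecified.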


Section  10.3:  Implementation  and  Virtual  Objects 

One object can be used to simulate the behavior of another. In fact,
implementing more abstract data-structures using simpler ones is a great part of the
effort in symbolic programming. Most often we use data-structures to represent basic
mathematical concepts such as SETs, SEQUENCEs, MAPPINGs, GRAPHs, and MULTI-SETs. However,
there are a host of ways to build any of these representations. SETs may be
represented by LISTs, ARRAYs, or HASH-TABLEs; MAPPINGs by LISTS-OF-PAIRS, PAIRS-OF-ARRAYS,
HASH-TABLES-OF-PAIRS, etc.


The data description language, therefore, requires two further extensions. First
we must have a means of describing an implementation method; second, we need a
way of stating that a particular abstract object is implemented using a particular
method. Let us consider the classic example of implementing a stack using an array
and a POINTER. (What thesis could be complete without a stack implemented as an
array and a POINTER?) Let us say that we define a stack to be an object with two
parts, a top and a history, where the history is required to be another stack (EMPTY or
NON-EMPTY). We may describe this as follows:


    (Implementation-Method Stack-As-Array
      (Abstract-Object Stack)
      (Concrete-Objects
        (Implementing-Array
          (type-restriction
            (Array (whose (members-type [Members-type Stack])))))
        (Implementing-Pointer
          (type-restriction Integer))
        (restriction (Index Implementing-Array Implementing-Pointer)))
      (Represent (Top Stack Object)
        (definition (top S O)
          <=> (Item Implementing-Array Implementing-Pointer O)))
      )

Before going on to the rest of the method let us look at what we have so far. We say that
one object may be represented by the behavior of a set of other objects using the
ABSTRACT-OBJECT and CONCRETE-OBJECT clauses. These, of course, include TYPE and other
restrictions which establish the pre-requisite conditions of the representation. We then
describe how each of the undefined relations of the abstract object is mapped onto
relations involving the concrete objects. This is done in the REPRESENT clause using
equivalence statements like those used to define relations.

An implementation method implicitly defines a new object-type, namely objects
of the abstract type which are implemented according to the specific method. This
object-type is given the name of the IMPLEMENTATION-METHOD. Thus, we can build the
following REASON rules which relate properties of objects of the abstract type to
properties of objects of the concrete types:


    (Rule ((:f1 ((Object-type :S1 Stack-As-Array) :sit))
           (:f2 ((Implementing-Array :S1 :A) :sit))
           (:f3 ((Implementing-Pointer :S1 :I) :sit))
           (:f4 ((Item :A :I :O) :sit)))
      (Assert '((Top :S1 :O) :sit)
              '(Implementation :f1 :f2 :f3 :f4 :f5)))

    (Rule ((:f0 (Expand ((Top :S1 :O) :sit))))
      (Rule ((:f1 ((Object-type :S1 Stack-As-Array) :sit))
             (:f2 ((Top :S1 :O) :sit)))
        (Assert '((Item [Implementing-Array :S1]
                        [Implementing-Pointer :S1] :O) :sit)
                '(expand-imp :f1 :f2))))

Notice that each concrete object in an IMPLEMENTATION-METHOD is given a name; this
allows us to refer to the concrete objects by their description, e.g. "the
IMPLEMENTING-ARRAY of STACK-1". This bears such a similarity to the ways in which parts are used
that I refer to this as an implementation part. When a new METHOD is described,
assertions are added to the knowledge base stating what the implementation parts are:

    (Implementation-Part Stack-As-Array Implementing-Pointer)

    (Implementation-Part Stack-As-Array Implementing-Array)

The corresponding TYPE-RESTRICTION statements are also added to the knowledge base.
These are then used in exactly the same way as are PART statements and their
TYPE-RESTRICTIONs. Also notice that the object-type implicitly defined by an
IMPLEMENTATION-METHOD is a subtype of the abstract type. When a new method is entered into the
system an assertion like the following is added to the data-base:

    (Method-for Stack Stack-As-Array)

This triggers the following rule:

    (Rule ((:f (Method-for :type-1 :type-2)))
      (Assert '(Subset :type-2 :type-1)
              '(method-impl-is-subset :f)))

We still need to define how the HISTORY (SUB-STACK) part of the stack is
implemented. Intuitively we want to say that the history is the stack implemented by
(1) the same array and (2) the number which is one smaller than that of the current
stack. In the data description language, therefore, we let the IMPLEMENTATION-METHOD name
be a special operator. A predicate beginning with an IMPLEMENTATION-METHOD name
includes an object name and a WHOSE clause for each IMPLEMENTATION-PART-NAME. For
example,

    (Stack-As-Array (whose (Implementing-Array :A1) (Implementing-Pointer :N)) Stack-1)

means that STACK-1 is the stack implemented by the array A1 and the number N, using
the method STACK-AS-ARRAY. Thus, we may now describe how the history is represented
by stating:

    (Represent (History Stack Stack)
      (definition (History S1 S2)
        <=> (Stack-As-Array (whose
              (Implementing-Array [Implementing-Array S1])
              (Implementing-Pointer
                [plus -1 [Implementing-Pointer S1]]))
             S2)))

Using  the  techniques  I  have  shown  so  far,  this  is  turned  into  the  following  rule: 

    (Rule ((:f1 ((Object-type :S1 Stack-As-Array) :sit))
           (:f2 ((Implementing-Array :S1 :A) :sit))
           (:f3 ((Implementing-Pointer :S1 :N1) :sit))
           (:f4 ((Plus -1 :N1 :N2) :sit))
           (:f5 ((Object-type :S2 Stack-As-Array) :sit))
           (:f6 ((Implementing-Array :S2 :A) :sit))
           (:f7 ((Implementing-Pointer :S2 :N2) :sit)))
      (Assert '((History :S1 :S2) :sit)
              '(Implementation :f1 :f2 :f3 :f4 :f5 :f6 :f7)))

The REPRESENT clauses in IMPLEMENTATION-METHODs are handled almost exactly like
RELATION-DEFINITIONs. In the next chapter, I will discuss how REASON analyzes
side-effects, discussing the problems posed by defined relations extensively at that point.
The reader should bear in mind that by defined relations I mean both
RELATION-DEFINITIONs and the equivalences given in the REPRESENT clauses.
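
The STACK-AS-ARRAY method described above can be mirrored by a small Python class.
This is only a sketch under assumptions of my own choosing: the class and field
names are hypothetical, indexing is zero-based, and an out-of-range pointer is
reported as an error because the method requires the pointer to be a valid index.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StackAsArray:
    array: List[object]  # implementing array (shared, never copied)
    pointer: int         # implementing pointer: index of the top item

    def top(self):
        # TOP is the item of the implementing array at the pointer.
        if not 0 <= self.pointer < len(self.array):
            raise IndexError("pointer is not a valid index of the array")
        return self.array[self.pointer]

    def history(self):
        # HISTORY is the stack on the same array with the pointer
        # one smaller.
        return StackAsArray(self.array, self.pointer - 1)
```

No data is moved when taking the history; the new stack is a different view of the
same concrete array.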

Before moving on to a brief catalogue of object-type descriptions, I should
observe that the terms abstract object and concrete object used in IMPLEMENTATION-METHODs
are a bit of a misnomer. Indeed, sometimes ARRAYs are used to implement other ARRAYs,
making the notion of abstract and concrete fuzzy at best. This type of representation
is hidden in the above example about stacks, where we talked about the HISTORY. This
is, in fact, nothing but another ARRAY represented as a SUB-ARRAY of the first.


It is a well known technique (used in sorting programs, for example) to divide a
single ARRAY into SUB-ARRAYs, using indices to separate the conceptually distinct "virtual"
arrays. Thus, given an array and two numbers (A and B) which are indices of that
array, we can represent another array of size (B - A) as follows: the ith element of the
virtual array is the (A + i)th element of the concrete array. This is expressed as
follows:

    (Implementation-method Array-Segment
      (Abstract-Object: Array)
      (Concrete-Objects
        (Implementing-Array (type-restriction
            (Array (whose: (members-type: [members-type Array])))))
        (Lower-Index (type-restriction Integer))
        (Upper-Index (type-restriction Integer))
        (restriction: (> [size Implementing-Array] [size Array])
                      (Index Implementing-Array Lower-Index)
                      (Index Implementing-Array Upper-Index)
                      (> Upper-Index Lower-Index)
                      (Plus Lower-Index [Size Array] Upper-Index)))
      (represent (item array number object)
        (definition (Item A I Obj)
          <=> (item Implementing-Array [plus Lower-Index I] Obj))))
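
The index arithmetic of the ARRAY-SEGMENT method can be checked with a short Python
sketch. The class ArraySegment is hypothetical and uses zero-based indices, which is
an assumption of this example; the constructor enforces the restrictions that both
bounds are valid indices and that the upper index exceeds the lower one.

```python
class ArraySegment:
    """A virtual array occupying part of a concrete array."""

    def __init__(self, array, lower, upper):
        if not (0 <= lower < upper < len(array)):
            raise ValueError(
                "lower and upper must be valid indices with lower < upper")
        self.array = array
        self.lower = lower
        self.upper = upper

    def __len__(self):
        # Size of the virtual array: upper index minus lower index.
        return self.upper - self.lower

    def item(self, i):
        # Element i of the virtual array is element (lower + i)
        # of the concrete array.
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self.array[self.lower + i]
```

Since the segment only stores indices into the concrete array, a change made to the
concrete array is immediately visible through the virtual one.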


Section 10.4: A Catalogue of Object Descriptions

So far, I have introduced a number of mechanisms for the description of objects.
In this section I will present a systematic development of REASON's knowledge about
data. First, I will state the basic mathematical knowledge about SETs, MAPPINGs, etc.
Then I will define several basic programming data-structures, leading up to LISTs, TREEs
and ASSOCIATIVE-RETRIEVAL HASH-TABLEs like that coded in the scenario.

I will begin by defining SET as an object-type with MEMBERSHIP, UNION, INTERSECTION,
etc. relationships.


(Object-type-definition Set
  (parameters Members-type (type-restriction: Object-type)
              Size (type-restriction: Pos-Number))
  (partition Set into (Non-Empty-Set Empty-Set)
    (allows Non-Empty-Set (member)))
  (Relation (Member Set Members-type))
  (type-restriction (Member Set) Members-type)
  (restriction (Cardinality Set Size))
  (Relation (Union Set Set Set)
    (definition (Union S-1 S-2 S-3)
      <=> (Equiv (:el) (Or (Member S-1 :el) (Member S-2 :el))
                       (Member S-3 :el))))
  (Relation (Intersection Set Set Set)
    (definition (Intersection S-1 S-2 S-3)
      <=> (Equiv (:el) (And (Member S-1 :el) (Member S-2 :el))
                       (Member S-3 :el))))
  (Relation (Set-Minus Set Set Set)
    (definition (Set-Minus S-1 S-2 S-3)
      <=> (Equiv (:el) (And (Member S-1 :el) (Not (Member S-2 :el)))
                       (Member S-3 :el))))




  (Relation (Equal Set Set)
    (definition (Equal S-1 S-2)
      <=> (Equiv (:el) (Member S-1 :el)
                       (Member S-2 :el))))
  (Relation (Size Set Pos-Number)
    (definition (Size S N)
      <=> (If (Object-type S Empty-Set)
              then (Size S 0)
              else (For-all (Member S :el)
                     (Size [Set-Minus S {:el}]
                           [Minus N 1]))))))

Notice that in the definition of size, I introduced the braces ({ ... }) notation for set
presentation. REASON (using the macro character facilities of MacLisp) translates
this notation into the form which is used internally. Set presentation with braces can
take either of two forms, extensional or intentional; in the former the set is presented
by listing its members within the braces:

    {1 2 5 9}

Intentional presentation defines the members of a set as those objects which satisfy a
formula with one free variable. The variable is given before a vertical bar | and the
sentence afterwards. For example, the set of pigs with wings is presented as follows:

    {:x | (and (pig :x) (winged :x))}

In the internal form, these are represented as follows:

    (Extensional-Set (list of Objects) Set)
    (Intentional-Set Variable Pred Set)

for example,

    (Extensional-Set (1 2 5 9) S-1)
    (Intentional-Set :x (and (pig :x) (winged :x)) S-2)

which say respectively that S-1 is the set of the numbers 1 2 5 and 9 and that S-2 is
the set of pigs with wings.
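
The difference between the two presentations can be illustrated in Python. This is a hedged sketch rather than REASON's internal form: the class names and the dictionary encoding of animals are assumptions made for the example.

```python
class ExtensionalSet:
    """A set presented by listing its members."""

    def __init__(self, members):
        self.members = frozenset(members)

    def __contains__(self, obj):
        return obj in self.members


class IntentionalSet:
    """A set presented by a formula with one free variable (a predicate)."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __contains__(self, obj):
        # Membership reduces to showing that obj satisfies the predicate.
        return bool(self.predicate(obj))


s1 = ExtensionalSet([1, 2, 5, 9])
# Hypothetical stand-in for the pigs-with-wings example: animals as dicts.
s2 = IntentionalSet(lambda a: a["kind"] == "pig" and a["winged"])
```

With these definitions, `5 in s1` is decided by looking the object up, while membership in `s2` is decided by evaluating the defining predicate on the candidate object.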




The  following  set  of  REASON  rules  interpret  these  assertions: 

(Rule ((:f (Intentional-Set var pred set)))
  (Assert '(object-type :set Set) '(Int-Set-Type :f))
  (Rule ((:g pred))
    (Assert '(Member :set :var) '(Int-Set-Mem :f :g))))

(Rule ((:f (Intentional-Set var pred set))
       (:g (Not pred)))
  (Assert '(Not (Member :set :var)) '(Int-Set-Not-Mem :f :g)))

(Rule ((:f (Extensional-Set list set)))
  (Assert '(object-type :set Set) '(Ext-Set-Type :f))
  (Mapc '(lambda (x)
           (Assert '(Member :set ,x) '(Ext-Set-Mem :f)))
        list))

(Rule ((:f (show (not (member set object)) by (ext-set)
                 for goal in context))
       (:g (extensional-set list set)))
  (Or (member object list)
      (Assert '(Not (member :set :object)) '(ext-set-not-mem :g))))

(Rule ((:f (goal (Member set obj) for goal in context))
       (:g (Intentional-Set var pred set)))
  (Propose-Method
    '(Method :f (Int-Set :g)) '(IntSet :f :g)
    (let ((new-pred (subst obj var pred)))
      (goal-assert new-pred '((Member :set :obj) . goal)
                   context '(Int-set-subgoal :f :g)))))

where the above is a consequent rule which is used only when requested. It simply
states that a sufficient sub-goal is to show that the object satisfies the defining
predicate.




The next mathematical object-type in REASON's knowledge base is mapping:

(Object-type-definition Mapping
  (parameters domain-type (type-restriction: object-type)
              range-type (type-restriction: object-type))
  (is-a: (Set
           (whose: (members-type: (association
                                    (whose: (key-type: domain-type)
                                            (value-type: range-type)))))))
  (Relation (Image Mapping domain-type range-type)
    (definition (Image Map Key Value)
      <=> (There-is (:el) (Member Map :el)
            such-that (And (Key :el Key) (Value :el Value)))))
  (Relation (Domain Mapping (Set (whose: (Members-type: domain-type))))
    (definition (Domain Map D)
      <=> (Id D {:el | (There-is (:as) (Member Map :as)
                         such-that (Key :as :el))})))
  (Relation (Range Mapping (Set (whose: (members-type: range-type))))
    (definition (Range Map R)
      <=> (Id R {:el | (There-is (:as) (Member Map :as)
                         such-that (Value :as :el))})))
  (Relation (Range-element Mapping Range-type)
    (definition (Range-element Map R)
      <=> (Member [Range Map] R)))
  (Relation (Domain-element Mapping domain-type)
    (definition (Domain-element Map D)
      <=> (Member [domain Map] D))))
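
As a rough illustration of the image, domain and range relations, a mapping can be modelled in Python as a set of key/value associations. The names `Association`, `image`, `domain` and `range_of` are chosen for this sketch and are not REASON identifiers.

```python
from collections import namedtuple

# An association pairs a key with a value; a mapping is a set of them.
Association = namedtuple("Association", ["key", "value"])


def image(mapping, key, value):
    """True if some association in the mapping pairs key with value."""
    return any(a.key == key and a.value == value for a in mapping)


def domain(mapping):
    """The set of keys that occur in some association."""
    return {a.key for a in mapping}


def range_of(mapping):
    """The set of values that occur in some association."""
    return {a.value for a in mapping}


# A general mapping may relate one key to several values.
m = {Association("a", 1), Association("b", 2), Association("a", 3)}
```

Note that nothing here forces a key to have a single image; that restriction only arrives with the more specific kinds of mapping.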

Notice that I used the notion of an association (or pair) in defining a mapping. The
following defines association:

(Object-type-definition Association
  (parameters Key-type (type-restriction: object-type)
              Value-type (type-restriction: object-type))
  (parts Key (type-restriction: Key-type)
         Value (type-restriction: Value-type)))

I will now define two more specific kinds of mappings, namely functions and 1-to-1
mappings.




(Object-type-definition Function
  (defining-restriction (subtype Function Mapping)
    (restriction (for-all (:d) (domain-element Mapping :d)
                   (there-is-a-unique (:as) (Member Mapping :as)
                     such-that (key :as :d))))))

(Object-type-definition 1-to-1-Mapping
  (defining-restriction (subtype 1-to-1-mapping function)
    (restriction (for-all (:r) (Range-element function :r)
                   (there-is-a-unique (:as) (Member function :as)
                     such-that (value :as :r))))))

Notice that since functions are mappings and 1-to-1-mappings are functions, each of these
may be invoked with the domain-type and range-type parameters specified in the mapping
definition.
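
The two defining restrictions can be checked mechanically on a finite mapping. The following Python sketch assumes a mapping is given as a set of `(key, value)` tuples; the function names are hypothetical.

```python
from collections import Counter


def is_function(pairs):
    """Every domain element is the key of exactly one association."""
    key_counts = Counter(key for key, _ in pairs)
    return all(count == 1 for count in key_counts.values())


def is_one_to_one(pairs):
    """A function in which every range element is the value of exactly one association."""
    value_counts = Counter(value for _, value in pairs)
    return is_function(pairs) and all(c == 1 for c in value_counts.values())


squares = {(1, 1), (2, 4), (3, 9)}
parity = {(1, "odd"), (2, "even"), (3, "odd")}
relation = {(1, "a"), (1, "b")}
```

Here `squares` is both a function and 1-to-1, `parity` is a function but not 1-to-1, and `relation` is a mapping that is not a function, mirroring the subtype chain in the definitions.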

A sequence is a mapping from a numeric-interval to a set:

(Object-type-definition Sequence
  (Parameters Size (type-restriction: Pos-Integer)
              Members-type (type-restriction: Object-type))
  (is-a: (function
           (whose: (domain:
                     (numeric-interval
                       (whose: (lower-limit: 1)
                               (upper-limit: size))))
                   (range-type: Members-type)))
         (rename (Range-element to Member)
                 (domain-element to Index))))

Notice that I used a rename clause within the is-a expression above. This simply
means that what was called range-element in the type function is called member in the type
sequence. REASON copies in the old definition, substituting member for range-element.

The description of sequences makes use of the notion of a numeric-interval which
has already been given in the text; I will repeat it here as well:




(Object-definition Numeric-Interval
  (Parameters (Upper-bound (type-restriction: Integer)
               Lower-bound (type-restriction: Integer))
    (restriction (less-than Lower-bound Upper-bound)))
  (is-a (Set (whose: (Members-type: Integers))))
  (definition (Numeric-Interval (whose: (upper-bound: U) (lower-bound: L)))
    = {:el | (And (less-than :el U) (greater-than :el L))}))

A sequence of particular importance (particularly in describing arrays) is a sequence of
positive integers of a given size. Since I will use such objects in defining arrays I will
need to define a notion of one such sequence being within the bounds established by
another such sequence (e.g. a sequence of indices being within the array bounds).

(Object-type-definition Sequence-of-pos-integers
  (Parameters Size (type-restriction: Pos-Integer))
  (is-a (Sequence (whose: (Members-type: Pos-Integer)
                          (Size: size)))
        (rename Image to Item))
  (Relation (In-bounds Sequence-of-Pos-integers Sequence-of-Pos-integers)
    (definition (In-bounds S-1 S-2)
      <=> (for-all (:index) (Index S-1 :index)
            (less-than [Item S-1 :index] [Item S-2 :index])))))
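
The in-bounds relation amounts to a position-by-position comparison. A minimal Python sketch, assuming both sequences are given as tuples of the same size and that `in_bounds` is a hypothetical helper name:

```python
def in_bounds(indices, bounds):
    """True if each index is less than the bound at the same position.

    Mirrors the strict less-than of the relation above; whether real
    array bounds should be inclusive is left open in this sketch.
    """
    if len(indices) != len(bounds):
        raise ValueError("sequences must have the same size")
    return all(i < b for i, b in zip(indices, bounds))
```

For a two-dimensional array with upper bounds `(3, 4)`, the index sequence `(1, 2)` is in bounds, while `(3, 2)` is not because its first component is not strictly less than the first bound.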

I  will  now  define  arrays  as  they  are  found  in  MacLisp  (all  dimensions  are  positive 
integers). 

(Object-type-definition Lisp-Array
  (Parameters dimension (type-restriction: Pos-Integer)
              Upper-bound (type-restriction:
                            (Sequence-of-Pos-Integers (whose: (size: dimension))))
              Members-type (type-restriction: object-type))
  (Indexed-part Item (type-restriction: Members-type)
    (Index-restriction Index
      (type-restriction: (Sequence-of-Pos-Integers (whose: (size: dimension))))
      (restriction (Within-bounds Index Upper-bound))))
  (Relation (Index-of Lisp-Array
                      (Sequence-of-Pos-Integers (whose: (size: dimension))))
    (definition (Index-of Array Seq)
      <=> (Within-bounds Seq Upper-bound))))




I will now move to a more complicated data-structure, namely hash-coded data-bases.
There are several such systems; the feature common to all of them is the use of a
hashing function and an array. I will start by defining a hash function.

(object-type-definition hash
  (parameters domain-type (type-restriction: object-type)
              size (type-restriction: pos-integer))
  (is-a function (whose: (domain-type: domain-type)
                         (range: (numeric-interval
                                   (whose: (lower-bound: 0)
                                           (upper-bound: size)))))))

The simplest hashing system is one which calculates a hash from the entire data-base
item. The next simplest is an associative system which calculates the hash on the key
part of the data-base item. Both these insert the item into a single place in the
data-base. A more complicated system will be described later.

(Object-type-definition hashtable
  (Parameters members-type (type-restriction: object-type)
              size (type-restriction: pos-integer)
              hash (type-restriction: (hash
                                        (whose: (domain-type: Members-type)
                                                (size: size)))))
  (is-a (lisp-array (whose: (members-type:
                              (set (whose: (members-type: Members-type))))
                            (size: size)))
        (rename item to bucket-part))
  (Relation (Member hashtable Members-type)
    (definition (Member ht el)
      <=> (Member [bucket-part ht [hash el]] el))))
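
The membership definition says that an element belongs to the table exactly when it belongs to the bucket it hashes to. A small Python sketch of that reading follows; the class name, the use of Python's built-in `hash`, and the modulo reduction are assumptions of the example, not part of REASON.

```python
class HashTable:
    """An array of buckets; each bucket is a set of members."""

    def __init__(self, size, hash_fn=hash):
        self.size = size
        self.hash_fn = hash_fn
        self.buckets = [set() for _ in range(size)]

    def _bucket(self, element):
        # Reduce the hash into the array's index range.
        return self.buckets[self.hash_fn(element) % self.size]

    def insert(self, element):
        self._bucket(element).add(element)

    def member(self, element):
        # Member(ht, el) <=> el is a member of the bucket el hashes to.
        return element in self._bucket(element)


ht = HashTable(4)
ht.insert(10)
```

Only one bucket is ever consulted, so two elements that collide (such as 10 and 6 with four buckets) share a bucket without being confused with each other.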

Now for the associative version of hash-tables:




(Object-type-definition associative-hashtable
  (Parameters key-type (type-restriction: object-type)
              value-type (type-restriction: object-type)
              size (type-restriction: pos-integer)
              hash (type-restriction:
                     (hash (whose: (domain-type: Key-type)
                                   (size: size)))))
  (is-a
    (lisp-array
      (whose: (members-type:
                (set (whose: (members-type:
                               (association (whose: (key-type: key-type)
                                                    (value-type: value-type)))))))
              (size: size)))
    (rename item to bucket-part))
  (Relation (Member hashtable Members-type)
    (definition (Member ht el)
      <=> (Member [bucket-part ht [hash [key el]]] el))))
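
In the associative version only the key part of an item is hashed. The Python sketch below illustrates this under the same assumptions as before (built-in `hash`, modulo reduction); the `lookup` method is an addition for the example and does not appear in the definition.

```python
class AssociativeHashTable:
    """An array of buckets holding (key, value) associations, hashed on the key."""

    def __init__(self, size, hash_fn=hash):
        self.size = size
        self.hash_fn = hash_fn
        self.buckets = [set() for _ in range(size)]

    def _bucket(self, key):
        return self.buckets[self.hash_fn(key) % self.size]

    def insert(self, key, value):
        self._bucket(key).add((key, value))

    def member(self, key, value):
        # The association is a member iff it is in the bucket its key hashes to.
        return (key, value) in self._bucket(key)

    def lookup(self, key):
        # Hypothetical convenience: all values stored under this key.
        return [v for k, v in self._bucket(key) if k == key]


table = AssociativeHashTable(8)
table.insert("x", 1)
```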

So far I have not dealt with recursively defined structures such as lists, trees, graphs,
etc. I will develop these by first defining an object-type called recursive-structures. I
will then define lists, trees, etc. as special cases of this object-type:




(Object-type-definition Recursive-structure
  (partition Recursive-Structure into (terminal non-terminal)
    (allows non-terminal (immediate-children)))
  ;;; A recursive structure is always built from the "recurring parts" which are
  ;;; called the non-terminals and the "stopping parts" which are called terminals.
  (parameters value-names (type-restriction: set))
  (parts Immediate-Children (type-restriction:
                              (set (whose:
                                     (members-type: Recursive-structure))))
         Values (type-restriction: (association (whose: (domain: value-names)))))
  ;;; The definition is parameterized by a set of "values" which are other fields
  ;;; present at each node, but which are not involved in the recursion.

  ;;; Now make the basic definitions. The immediate children are the first
  ;;; level of recursion, i.e. the nodes pointed to directly by the current node.
  (relation (immediate-child recursive-structure recursive-structure)
    (definition (immediate-child rs-1 rs-2)
      <=> (Member [immediate-children rs-1] rs-2)))
  ;;; A proper node is one gotten to via an immediate-child link.
  (relation (proper-node recursive-structure recursive-structure)
    (definition (Proper-node rs-1 rs-2)
      <=> (Or (immediate-child rs-1 rs-2)
              (there-is (immediate-child rs-1 :ic)
                such-that (proper-node :ic rs-2)))))
  ;;; Node is the transitive closure of immediate-child, i.e. anything you can
  ;;; get to by first going to an immediate child and then its immediate child, etc.
  (relation (node recursive-structure recursive-structure)
    (definition (node rs-1 rs-2)
      <=> (or (proper-node rs-1 rs-2)
              (id rs-1 rs-2))))




  ;;; A terminal node is a node which is a terminal. It stops the recursion.
  (relation (terminal-node recursive-structure recursive-structure)
    (definition (terminal-node rs-1 rs-2)
      <=> (and (node rs-1 rs-2) (object-type rs-2 terminal))))
  ;;; Non-terminal nodes are the other guys.
  (relation (non-terminal-node recursive-structure recursive-structure)
    (definition (Non-terminal-node rs-1 rs-2)
      <=> (and (node rs-1 rs-2) (object-type rs-2 non-terminal))))
  ;;; If all the non-terminal nodes have the same number of immediate children,
  ;;; this number is called the node degree. Lists have node degree 1, binary
  ;;; trees have node degree 2, some graphs have no degree by this def.
  (relation (Node-degree Recursive-structure Pos-integer)
    (definition (Node-degree RS N)
      <=> (for-all (Non-terminal-Node RS :Node)
            (Size [Immediate-Children :Node] N))))
  ;;; Two structures share if they have a common node. This is very important
  ;;; for reasoning about side effects.
  (relation (shares-structure recursive-structure recursive-structure)
    (definition (shares-structure rs-1 rs-2)
      <=> (there-is (proper-node rs-1 :node)
            such-that (proper-node rs-2 :node))))
  ;;; If you can find a node somewhere in this structure which is a node of itself
  ;;; in a non-trivial way (nodes were defined to be nodes of themselves) then the
  ;;; structure has cycles. This is usually very bad.
  (relation (has-cycles recursive-structure)
    (definition (has-cycles rs-1)
      <=> (There-is (node rs-1 :node)
            such-that (proper-node :node :node))))
  ;;; Structures can be divided into those with cycles and those without.
  ;;; Notice that structures now have two different partitions.
  (partition recursive-structure into (cyclic-structure acyclic-structure)
    (dividing-criterion (Has-cycles cyclic-structure)))




  ;;; The rule of structural induction can be applied to any acyclic structure
  ;;; to prove that some property holds for all of its nodes.
  (proof-rule
    (goal: Property
      (where (occurs-in Property var)
             (object-type var acyclic-structure)))
    (subgoals (for-all (:term) (terminal-node acyclic-structure :term)
                ,(subst term var property))
              (for-all (:non-term) (non-terminal-node acyclic-structure :non-term)
                (implies
                  (for-all (:child) (immediate-child :non-term :child)
                    ,(subst child var property))
                  ,(subst non-term var property))))))

This is the only object-type definition so far where I have found it useful to include a
proof-rule with the definition. The syntax is, therefore, somewhat ad hoc. REASON
builds a consequent reasoning rule and a method-proposer from this statement. The
method proposer will trigger if there is a goal statement which includes an object
which is an acyclic-structure. If the method is accepted, it creates the two sub-goals
of showing that (1) the property holds for all terminal-nodes of the object and (2) if it
holds for all immediate-children of a node then it holds for the node itself. Notice the use
of commas (,) in front of the subst expressions to indicate that they should be
evaluated (i.e. subst is to be invoked as a function; it is not a predicate name).
Structural-induction is also used on binary-trees and lists, but the rule is simply copied
into their definitions since they are defined as special kinds of recursive-structures.
(Boyer & Moore, 1975,77) uses structural induction extensively to prove theorems in
recursive function theory and Pure Lisp.
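
The node relations defined for recursive-structures (proper-node, shares-structure and has-cycles) can be made concrete with a short Python sketch. It is only an illustration: the `Node` class and function names are hypothetical, and object identity (`id`, `is`) stands in for the notion of "the same node".

```python
class Node:
    """A recursive structure: a node with zero or more immediate children."""

    def __init__(self, *children):
        self.children = list(children)


def proper_nodes(rs):
    """All nodes reachable through one or more immediate-child links."""
    seen, stack, found = set(), list(rs.children), []
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        found.append(node)
        stack.extend(node.children)
    return found


def shares_structure(rs1, rs2):
    """True if some proper node of rs1 is also a proper node of rs2."""
    ids = {id(n) for n in proper_nodes(rs1)}
    return any(id(n) in ids for n in proper_nodes(rs2))


def has_cycles(rs):
    """True if some node of rs is a proper node of itself."""
    nodes = [rs] + proper_nodes(rs)
    return any(any(p is n for p in proper_nodes(n)) for n in nodes)


leaf = Node()
a, b = Node(leaf), Node(leaf)   # two structures sharing one leaf
c = Node(Node())                # a structure built from fresh nodes
loop = Node()
loop.children.append(loop)      # a node that is its own immediate child
```

Structural induction is sound only when `has_cycles` is false, since otherwise the chain of immediate children never bottoms out in terminal nodes.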

Next I will define a binary-tree as a special kind of acyclic-recursive-structure whose
immediate-children is a set of two binary-trees.


(Object-type-definition Binary-tree
  (Parts Left (type-restriction: binary-tree)
         Right (type-restriction: binary-tree))
  (is-a (acyclic-structure (whose: (Node-degree: 2)))
        (map (Immediate-children binary-tree)
             into: {[left binary-tree] [right binary-tree]})))



Notice the use of the map clause above. Remember that whenever an object is
defined with an is-a clause the system copies all the information about the super-type
into the new definition. The map clause tells REASON that any expression involving
immediate-children in the definition of recursive-structures should be replaced by an
identical expression involving the extensionally presented set above. This can be
regarded as an equivalence saying that if a binary-tree is thought of as a
recursive-structure then the set of immediate-children is the set composed of the left of
the tree and the right of the tree. In a binary-tree the immediate-children set is a
virtual object composed of the two "real" parts, the left and the right.

Finally, I will define a list as a data-type with two parts: a first and a rest.
The chain of rests forms a recursive-structure. There is a substantial private
vocabulary associated with lists, which is indicated by the rename clauses. As above,
the map clauses are used to indicate how this structure satisfies the definition of
recursive-structures. A new construct, the singleton-type, is introduced to take care of
the fact that in LISP the only possible empty-list is the special object nil. A
singleton-type consists of one object; therefore, any object known to be of that type is
also known to be the unique object of that type. Finally, several relations peculiar to
lists are introduced.




(Object-type-definition list
  (Parameters Members-type (type-restriction: Object-type))
  (Parts First Rest (type-restriction: List))
  (partition list into (Empty-list Non-Empty-list)
    (allows Non-Empty-list (first rest member)))
  (singleton-type Empty-list object Nil)
  (is-a recursive-structure (whose: (Node-degree: 1))
    (map ((immediate-children list) into: {[rest list]})
         ((values list) into: {(First [First list])}))
    (rename (node to sublist)
            (terminal to empty-list)
            (non-terminal to non-empty-list)
            (non-terminal-node to non-empty-sublist)
            (terminal-node to empty-sublist)))
  (relation (Member list Members-type)
    (definition (Member L O)
      <=> (Or (first L O)
              (Member [rest L] O))))
  (function (length list Pos-Integer)
    (definition (length L N)
      <=> (if (Object-type L Empty-list)
              then (Id N 0)
              else (length [rest L] [Minus N 1]))))
  (relation (Comes-before-in list Members-type Members-type)
    (definition (Comes-before-in L O-1 O-2)
      <=> (There-is (Sublist L :Sub)
            Such-That (And (first :Sub O-1)
                           (Member [Rest :Sub] O-2))))))

Notice that in the map clause, the values part of the recursive-structure is mapped onto
the pairing of the name "first" with the first part of the list.
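
The three list relations (member, length and comes-before-in) translate directly into recursive Python functions. In this sketch the class `Cons` and the use of `None` for the unique empty list are assumptions made for illustration.

```python
class Cons:
    """A non-empty list cell; None plays the role of the unique empty list."""

    def __init__(self, first, rest=None):
        self.first = first
        self.rest = rest


def member(lst, obj):
    """obj is the first of lst, or a member of the rest of lst."""
    return lst is not None and (lst.first == obj or member(lst.rest, obj))


def length(lst):
    """0 for the empty list, otherwise one more than the length of the rest."""
    return 0 if lst is None else 1 + length(lst.rest)


def comes_before_in(lst, a, b):
    """True if some sublist starts with a and b is a member of its rest."""
    while lst is not None:
        if lst.first == a and member(lst.rest, b):
            return True
        lst = lst.rest
    return False


abc = Cons("a", Cons("b", Cons("c")))
```

Each function follows the chain of rests, which is the sense in which a list is a recursive-structure of node degree 1.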

The final object I will present is the CONNIVER-style associative-retrieval
hashing system. This is an array each of whose items is a list. The array holds
assertions which are binary-trees. An assertion is a member of the table if it is a member of
each bucket hashed to by any of its terminal nodes.




(Object-type-definition Conniver-hash-table
  (Parameters size (type-restriction: pos-integer)
              hash (type-restriction:
                     (hash
                       (whose:
                         (Domain-type: (pair: (whose: (left: Atom)
                                                      (right: Pos-Number))))
                         (size: size)))))
  (is-a (lisp-array (whose: (members-type:
                              (set (whose: (members-type: binary-tree-of-atoms))))
                            (size: size)))
        (rename item to bucket-part))
  (Relation (Member hashtable Binary-tree-of-Atoms)
    (definition (Member ht el)
      <=> (for-all (:node) (terminal-node el :node)
            (for-all (:pos) (position-in ht :node :index)
              (Member [bucket-part ht [hash :node :index]] el))))))

The full blown programmer's apprentice system will have an even more extensive
catalogue of descriptions; however, my point here is not so much to present the
complete catalogue used in REASON, as to show how the reasoning system gets its
information. The complete catalogue used by the apprentice is being worked on by
Rich (Rich, 1977,78). It is also for this reason that I did not include spec-types for
the various operations associated with object-types as is done in data-abstraction
languages. The final catalogue will do so, but for the reasoning system's purposes this
is not necessary.




Chapter 11: Reasoning About Side-effects

The ability to change the structure of an object while leaving its identity
unchanged provides a powerful mechanism for modularity and abstraction in advanced
programming languages such as LISP. However, precisely because side-effects on
global and shared structures possess such potential power, they also allow enormous
room for error. When a segment causes a side-effect it saves itself the worry of
communicating with a horde of other segments which might access the same structure,
but it does so at the price of requiring an assurance that it is safe to make the
proposed change. This assurance, unfortunately, can only be gained by engaging in a
non-local and expensive form of reasoning.

Simple changes can have non-trivial results, something every experienced
programmer has learned the hard way. Since complex structures are built from less
complex objects it follows that a side-effect to a part of a structure can change
properties of the whole structure. Even worse, since complex structures may share
sub-structure, a modification to one data structure might change a property of some
other data structure which had been thought of as a completely separate object. In
reasoning about the results of a particular action, REASON must assess what
properties besides those explicitly stated will also change.

In the context of common sense reasoning in AI this general problem has been
termed the frame problem in [McCarthy and Hayes, 1967 & 69] and has received a
considerable degree of attention [Raphael, 1970], [Hayes, 1971a & b]. An example
will illustrate the problem. Suppose I tell you that the saucer was taken to the
kitchen. If you knew that the cup was on the saucer, then you would probably infer
that the cup was now in the kitchen; the inference would probably be correct since
there is a causal relationship between the location of the cup and the location of the
saucer upon which it is placed. However, you would also never think to ask whether
the saucer had changed color when it was removed to the kitchen, since motion has
little to do with color.

In common sense reasoning, one has to assume that most things don't change
unless there is strong reason to believe that they do. When such assumptions lead to
trouble, one re-examines one's current belief system and rearranges things to correspond
to the realities. In the case of program understanding similar techniques also apply.
Consider the following procedure for swapping the first elements of two lists of
numbers without using a temporary variable:




(defun swap (list-1 list-2)
  (rplaca list-1 (- (car list-1) (car list-2)))
  (rplaca list-2 (+ (car list-1) (car list-2)))
  (rplaca list-1 (- (car list-2) (car list-1))))

A brief explanation of this procedure might be helpful. Suppose that the car of list-1
is a and the car of list-2 is b. Then the sequence of additions and subtractions leaves
the following values in the first position of the two lists:


                list-1     list-2     reason
    initially   a          b
    first sub   a - b      b
    addition    a - b      a          b + (a - b) = a
    2nd sub     b          a          a - (a - b) = b

Interestingly enough, this program has a bug; furthermore, few programmers (even
experts) spot the bug when examining the program. (The reader might try to figure
out what the problem is before proceeding.)

The problem is illustrated by considering what this program will do if called with
the same object for both arguments:

    (swap list-11 list-11)

Since the formal parameters list-1 and list-2 are bound to the same object, the
procedure fails, putting 0 into the car of list-11. The fact that most programmers fail
to spot this bug indicates that they are assuming that the two arguments are distinct
lists, even though they have no evidence supporting this assumption.
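
The aliasing bug is easy to reproduce outside Lisp. The following Python sketch transliterates the swap procedure using mutable lists, with index 0 playing the role of the car; the function and variable names are chosen for the example.

```python
def swap_first(list_1, list_2):
    """Swap the first elements in place without a temporary variable."""
    list_1[0] = list_1[0] - list_2[0]
    list_2[0] = list_1[0] + list_2[0]
    list_1[0] = list_2[0] - list_1[0]


# Distinct lists: the swap works.
xs, ys = [3, 0], [7, 0]
swap_first(xs, ys)

# Aliased arguments: both parameters name the same list, so the first
# subtraction overwrites the only copy of the value.
zs = [5, 0]
swap_first(zs, zs)
```

After the first call `xs` starts with 7 and `ys` with 3; after the second, `zs` starts with 0, because `a - a` destroys the value before it can be recovered.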

This indicates that, as in common sense reasoning, programmers use more than
one strategy for analyzing the effect of an action. In the more reckless strategy, one
assumes that things are not changed unless there is reason to believe they are. This
has the advantage of being right most of the time, requiring less effort, and allowing
the programmer to form a "first order" theory of what the code does. In the more
careful strategy, one does the opposite, assuming things are affected unless evidence to
the contrary exists. This has the advantage of never allowing a false conclusion to be
drawn, but the disadvantage of requiring much greater effort, to the extent that it can
prevent one from forming a "first order" understanding. In developing REASON, I
have experimented with two protocols corresponding to these two forms of analysis. I
will present these protocols after presenting some notational preliminaries in this next
section.


Section  11.1:  Specifying  Side-effects 

REASON allows special kinds of spec clauses to facilitate the description of
side-effects. The two such basic clauses are side-effect and new which have the
following format:

    (Side-effect changed-object Input-Sit Output-Sit New-clause)
    (New New-Object Input-Sit Output-Sit New-Clause)

The first of these states that on the transition from the input situation of the segment
to the output situation of the segment the changed-object is subjected to a side-effect
which makes new-clause true in the output situation. The second type of assertion
states that a new object is brought into existence (through use of the cons or related
operations) on the transition from the input to the output situation of the segment;
the new object satisfies the property stated in new-clause.

Within a segment's specs clauses the transition part of these statements is omitted
since it can unambiguously be inferred by the symbolic interpreter. Thus, in specs
one may write:

    (Side-effect object clause)
    (New object clause)

Since  a  side-effect  changes  properties  of  an  object  we  would  like  a  simple  way  to 
talk  about  the  object  both  before  and  after  the  side-effect  is  performed.  We  already 
have one mechanism, the object state description (object situation), which allows us
to distinguish between input and output states of the object. However, for notational
simplicity we also introduce another method. We allow the outputs clause to create a
second name for an input object, as follows:


(outputs (new-name id-to input-name))

This new name is referred to as the output name; whenever it is used in an assert
clause, it implicitly refers to the object in the output situation. Notice that spec
clauses are usually stated without explicit situation tags since these can be provided by
the symbolic interpreter using simple defaulting rules.

The specs for a simple side-effect like rplacd can be stated as follows using this
notation: 


(defspecs rplacd
  (inputs a-list new-rest)
  (expect (object-type a-list list))
  (outputs (the-new-list id-to a-list))
  (assert (side-effect a-list (rest the-new-list new-rest))))

In evaluating a set of specs, the symbolic interpreter builds a mapping which
matches input and output ports to objects. An output port mentioned in an id-to
clause is bound to the same object as is the input port of the clause. In the course
of interpreting the spec clause, the interpreter replaces input and output ports by the
objects to which they are matched. The interpreter also examines whether there are
any output port names in the clause; if so, and if the clause is not explicitly tagged
with a situation name, the output situation is added in. Thus, assuming that rplacd
were applied to list-1 and rest-1 in S-1, resulting in the new situation S-2, the above
side-effect clause would actually be asserted as:

(side-effect list-1 S-1 S-2 (rest list-1 rest-1))
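
The substitution and defaulting steps just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: clauses are flat tuples of symbols, and the function name `instantiate`, the port names, and the dictionary of bindings are hypothetical rather than part of REASON.

```python
def instantiate(clause, bindings, output_ports, in_sit, out_sit):
    """Replace port names by objects and choose a default situation tag.

    clause       -- flat tuple of symbols, e.g. ("rest", "the-new-list", "new-rest")
    bindings     -- port name -> object (an id-to port shares its input's object)
    output_ports -- set of port names declared in the outputs clause
    """
    # A clause mentioning any output port defaults to the output situation.
    mentions_output = any(sym in output_ports for sym in clause)
    body = tuple(bindings.get(sym, sym) for sym in clause)
    return body, (out_sit if mentions_output else in_sit)


# Hypothetical bindings for a segment applied to list-1 and rest-1.
bindings = {"a-list": "list-1", "new-rest": "rest-1", "the-new-list": "list-1"}
body, tag = instantiate(("rest", "the-new-list", "new-rest"),
                        bindings, {"the-new-list"}, "S-1", "S-2")
side_effect = ("side-effect", "list-1", "S-1", "S-2", body)
```

A clause that mentions only input ports would receive the input situation as its tag instead.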

This defaulting mechanism is quite useful in describing side-effects which relate some
property which is true of the object on input to some property true upon exit. For
example, we might want to increment by 1 some count field of a particular
data-structure. We would then say that the count field in the output situation is 1
plus the count field in the input situation. This, of course, involves a reference
expression; the same defaulting rules apply here: if the reference expression mentions
no output port it is defaulted to the input situation.


(defspecs bump
  (inputs a-record)
  (expect )
  (outputs (the-new-record id-to a-record))
  (assert
    (side-effect a-record
      (count the-new-record [plus 1 [count a-record]]))))


Assuming that the segment is applied to record-1 in S-1 yielding S-2, this is asserted as:


(side-effect record-1 S-1 S-2
  (count record-1 ([plus 1 ([count record-1] S-1)] S-2)))

Notice  that  in  making  the  defaults  a  reference  expression  which  is  resolved  in  the 
input  situation  is  regarded  as  an  input  name,  while  one  resolved  in  the  output 
situation  is  regarded  as  an  output  name  and  will  force  any  enclosing  expression  to  be 
regarded  as  an  output  expression.  This  is  not  always  convenient  but  it  can  always  be 
over-ridden  by  use  of  explicit  object-state  descriptions  or  fully  spelled  out  reference 
expressions. For example, the LISP function nreverse may be described as follows:


(defspecs nreverse
  (inputs a-list)
  (expect (object-type a-list list))
  (outputs new-list)
  (assert ((last-cell a-list new-list) *before*)
          (reverse (a-list *before*) (new-list *after*))))


where, as mentioned in Chapter 5, *before* and *after* are special symbols provided by
the symbolic evaluator to stand for the input and output situations of the segment.
These specs state that (1) the output list was the last cell of the input list at the time
of invocation of the segment and (2) the output list at the time of output has a
structure which is the reverse of that possessed by the input list at the time of input.




It should be noted that although the side-effect and new clauses are special in the
sense that they are explicitly trans-situational assertions, they are otherwise normal.
In particular, side-effect clauses can be part of a quantified statement. For example,
we might want to say that certain members of a set (those having a particular key)
were side-effected to turn on a particular mark. (This is done in one version of a
fast-intersection routine.) This can be expressed as follows:


(defspecs mark-some
  (inputs the-list the-key)
  (expect (object-type the-list list)
          (object-type the-key key))
  (outputs (the-new-list id-to the-list))
  (assert
    (for-all (member) (member the-new-list member)
      (implies (key member the-key)
        (side-effect member (marked member))))))


This quantified statement is treated in exactly the same manner as other statements; it
is asserted after the objects have been substituted for the input and output port
names. Asserting a for-all statement, as we have seen in Chapter 4, creates a rule
which triggers if its pattern is matched. This, in turn, will create a rule for the
implies statement, which triggers if its pattern is matched. If both these patterns are
matched (which is equivalent to saying that we know about a particular object which
both is a member of the list and has the appropriate key), then the system will
conclude that a side-effect definitely took place, namely that this member of the list
was marked.


Section 11.2: Reasoning About Simple Side Effects

Reasoning  about  side  effects  is  conceptualized  by  thinking  of  the  segment  as 
forming  a  transition  between  its  input  and  output  situations.  Side-effect  processing 
consists  of  deducing  which  properties  can  be  moved  across  this  transition  safely;  I, 
therefore,  refer  to  this  process  as  transition  analysis.  My  general  approach  of  explicitly 
recording  dependencies  suggests  that  REASON  should  provide  a  justification  whenever 
it decides to move a fact across a transition. Similarly, if it decides not to move a
fact it should justify this decision as well. These recorded dependencies allow
REASON to make an initial decision based on a cursory analysis of the circumstances,
while still reserving for itself the option of reconsidering in more detail at a later
time.


The basic protocol is as follows: An assertion in the input situation of a
transition can be moved to the output situation of the transition only if there is an
explicit assertion declaring it safe to do so. In phase one of the analysis, simple rules
are run to find reasons for not moving a fact. If such cause is found, these rules
assert that it is unsafe to move the fact. At the end of this process each assertion is
asserted to be safe to move; however, the justification for this safe assertion is
non-monotonic, depending on the outness of the corresponding unsafe assertion. Thus,
if any reason for considering the assertion unsafe had been found, the unsafe assertion
will be in, causing the safe assertion to be out. If at the end of this process a safe
assertion is in for a particular fact, then the fact will be asserted in the output
situation with its justification pointing to the safe assertion. The following rules carry
out these operations:




The Default Assumer for the Careful Protocol

(rule ((:f (side-effect :object :in-sit :out-sit :new-fact))
       (:g (:old-fact :in-sit))
       (:h (transition :in-sit :out-sit)))
  (assume '(not (safe :g :in-sit :out-sit))
          '(safety-first :f :g :h)))

The Default Assumer for the Fast and Dirty Protocol

(rule ((:f (side-effect :object :in-sit :out-sit :new-fact))
       (:g (:old-fact :in-sit))
       (:h (transition :in-sit :out-sit)))
  (assume '(safe :g :in-sit :out-sit)
          '(reckless-abandon :f :g :h)))

The Safe Fact Mover

(rule ((:f (safe :old-fact :in-sit :out-sit))
       (:old-fact (:fact :in-sit))
       (:g (transition :in-sit :out-sit)))
  (assert '(:fact :out-sit) '(safe-from-side-effect :f :g)))

Make The Side-effect Appear In The New Situation

(rule ((:f (side-effect :object :in-sit :out-sit :new-fact)))
  (assert '(:new-fact :out-sit)
          '(side-effects-happen :f)))

where  the  last  of  these  merely  asserts  that  the  relation  stated  in  the  side-effect 
assertion  is  true  in  the  output  situation;  this  is,  of  course,  independent  of  the  safe 
assertions. 
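
As a rough illustration of the fast and dirty protocol, the following Python sketch treats a fact as safe by default and withdraws that default only when an unsafe assertion is present. It assumes, purely for illustration, that the only source of unsafe assertions is a side-effect that explicitly negates a fact. The names `negate` and `analyze_transition` and the tuple encoding of facts are hypothetical, and the recorded justifications are simplified to tagged tuples.

```python
def negate(fact):
    """Return the negation of a fact encoded as a tuple."""
    return fact[1] if fact[0] == "not" else ("not", fact)


def analyze_transition(input_facts, new_facts):
    """Return (output, unsafe) for one transition.

    input_facts -- facts holding in the input situation
    new_facts   -- facts made true by the side-effects on the transition
    output maps each output-situation fact to a simplified justification.
    """
    # Phase one: look for reasons not to move a fact.
    unsafe = {f for f in input_facts if negate(f) in new_facts}
    output = {}
    for fact in input_facts:
        if fact not in unsafe:
            # The safe default is believed only while 'unsafe' is out.
            output[fact] = ("safe-from-side-effect", fact)
    for fact in new_facts:
        # The side-effect's own clause always holds in the output situation.
        output[fact] = ("side-effects-happen", fact)
    return output, unsafe
```

Because every moved fact carries a justification, a later and more careful pass could retract a move by bringing the corresponding unsafe assertion in.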

I have not yet shown any rules for determining what is safe and what is not.
The simplest such rule is the rule of direct negation, which says that a fact which is
explicitly negated by a side-effect is unsafe:




(rule ((:f (side-effect :obj :in-sit :out-sit (not :fact)))
       (:g (:fact :in-sit)))
  (assert '(not (safe :g :in-sit :out-sit))
          '(direct-negation :f :g)))


Another  simple  side-effect  rule  concerns  objects  with  parts.  Suppose  that  a 
record is modified to change one of its parts to a new value. It follows that the
assertion  stating  the  old  value  of  the  affected  part  of  the  record  is  not  a  safe 
assertion.  The  following  rule  states  this  fact: 


(rule ((:f (side-effect :object :in-sit :out-sit
                        (:part-name :object :new-value)))
       (:g ((object-type :object :type) :in-sit))
       (:h (part :type :part-name))
       (:i ((:part-name :object :old-value) :in-sit)))
  (assert '(not (safe :i :in-sit :out-sit))
          '(part-side-effect :f :g :h :i)))

A  second  rule  is  that  a  part-replacing  side-effect  cannot  affect  a  part  assertion 
involving  a  different  part-name. 


(rule ((:f1 (side-effect :object-1 :s-1 :s-2
                         (:new-part-name :object-1 :new-part)))
       (:f2 ((:old-part-name :object-1 :old-part) :s-1)))
  (cond
    ((eq :old-part-name :new-part-name))
    (t
      (assert '(safe :f2 :s-1 :s-2)
              '(diff-part-side-effect :f1 :f2)))))

Rules also exist for the independence of indexed-part assertions from part side-effects
and vice versa.

The next rule is for side-effects to indexed-parts. This rule introduces a new
level of complexity due to the presence of incomplete knowledge. Suppose we have an
object with an indexed structure (for example, an array, a hash-table, or a record
structure including an array) and that this object is modified, changing the part indexed
by index-2. Also suppose that we have an assertion saying what is the part indexed by
index-1.

Then,  the  side-effect  should  make  the  assertion  unsafe  if  the  two  indices  are  identical 
and  leave  it  unaffected  otherwise,  as  expressed  in  the  following  rule: 


(rule ((:f (side-effect :object :in-sit :out-sit
                        (:indexed-part-name :object :new-index :new-value)))
       (:g ((object-type :object :type) :in-sit))
       (:h (indexed-part :type :indexed-part-name))
       (:i ((:indexed-part-name :object :old-index :old-value) :in-sit)))
  (assert '(if ((equal :old-index :new-index) :in-sit)
               then (not (safe :i :in-sit :out-sit))
               else (safe :i :in-sit :out-sit))
          '(indexed-part-side-effect :f :g :h :i)))

This creates an if-then-else assertion whose justification points to the statements
relating to the side-effect. If the premise of the if-then-else (the equality of the two
indices) is determined to be true (false), then the then (else) clause of the if-then-else is
asserted. Its justification includes the if-then-else assertion and the equal assertion or
its negation.
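
The possible outcomes of this rule can be sketched as follows. The sketch assumes a data base represented as a Python set of tuple-encoded facts, and `indexed_part_safety` is a hypothetical name. It returns None when the data base contains neither the equality nor its negation, in which case neither branch of the if-then-else is asserted.

```python
def indexed_part_safety(old_index, new_index, database):
    """Return 'unsafe', 'safe', or None when the equality is undetermined."""
    equal = ("equal", old_index, new_index)
    if equal in database:
        return "unsafe"          # then-branch: the old assertion is clobbered
    if ("not", equal) in database:
        return "safe"            # else-branch: a different element was changed
    return None                  # premise and its negation are both absent
```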

Notice, however, that it is altogether possible that neither the premise of the
if-then-else nor its negation is present in the data base and thus that neither a safe
nor an unsafe assertion will be created. Later in this chapter we will see an example
where, due to hypothetical reasoning, it is not possible to know whether the two
indices are equal or not, since one of them is an anonymous object, standing for a
"typical" index of the array.




This has a very important impact on the protocol for side-effect processing.
Recall that this process goes through two passes, the first of which is "fast and dirty"
(corresponding to what most programmers would notice without explicitly considering
"screwball" cases). The second pass is more careful and includes an examination of
oddball cases of aliasing (like the swap example, shown earlier). The crucial point
here is that, in the first pass analysis, we consider a fact safe unless evidence to the
contrary is found.

If we have an indexed-part side-effect as above, and it cannot be determined
whether the two indices are equal, then the rule shown above will make no assertion
(i.e. neither a safe nor an unsafe assertion). But in the first pass "fast and dirty"
analysis, this lack of an unsafe assertion will be taken as grounds for assuming that
the assertion is safe; it will, therefore, be moved across the transition. This will not
be logically incorrect, since the justifications for the assumption are explicit and can
be withdrawn; it is, however, not a very useful strategy. Even for a fast and dirty
pass this strategy is a little too dirty. It would be more useful to say that if it is
possible that the two indices are equal, then we should not consider the assertion safe,
but should rather do a case analysis, considering separately the two possibilities of
equality and non-equality.

Since the conclusion that the assertion is not safe is based on the possibility that
the indices are equal, we need a way of asserting that this possibility exists; this
possibility assertion can then be included in the justification. Because the notion of
possibility is used quite frequently, I have developed some syntactic mechanisms to
facilitate the use of the concept. The starting point is the observation that an
assertion is possible as long as its negation is out; of course, if the assertion is in it is
also possible. Thus, the following support structure captures the notion of possibility:


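F-3  (Possible ((equal index-2 index-1) S-1))

F-2  (Not ((equal index-2 index-1) S-1))

F-1  ((equal index-2 index-1) S-1)

[Figure: support structure for possibility. F-3 is supported by F-1 being in or by F-2 being out.]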




This structure is created by calling the function is-possible with F-1 as argument.
Calling is-possible does not make F-1 in, nor does it make the negation of F-1 out; it
simply creates a support structure which says that if F-1 is in or if F-1's negation, F-2,
is out then F-3 should be in. The result of this is that if F-1 is possible and is-possible
is called with F-1 as argument then the assertion F-3 will be in.
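
The support condition for a possibility assertion is small enough to state directly. The Python sketch below is illustrative only; the function name `possibility_is_in` and the use of booleans for in and out status are assumptions, not REASON's representation.

```python
def possibility_is_in(fact_in, negation_in):
    """A possibility assertion is in if the fact is in or its negation is out."""
    return fact_in or not negation_in


# Nothing is known about the fact: it is possible.
unknown = possibility_is_in(fact_in=False, negation_in=False)
# The negation has come in (for example, by assumption): no longer possible.
refuted = possibility_is_in(fact_in=False, negation_in=True)
# The fact itself is in: certainly possible.
known = possibility_is_in(fact_in=True, negation_in=False)
```

Any assertion justified by the possibility assertion therefore goes out as soon as the negation of the underlying fact comes in.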

Given this, we can extend the rule for indexed-part side-effects to be more
cautious by adding the following to its body:


(is-possible ((equal :old-index :new-index) :in-sit))

(rule ((:f (possible ((equal :old-index :new-index) :in-sit))))
  (assert
    '(not (safe (:indexed-part-name :object :old-index :old-value)
                :in-sit :out-sit))
    '(careful-indexed-part :f)))

This says that if it is at all possible that the indices are equal, then the indexed-part
assertion should not be declared safe. If REASON decides that moving this assertion
is important it can try backward chaining rules on the if-then-else assertion to create a
case analysis. In one of these cases it will assume that the indices are not equal.
This will cause the assertion F-2 (the inequality of the indices) to come in, which will
cause the possibility assertion F-3 to go out, since its only support was the outness of
F-2. But this, in turn, will out the unsafe assertion since it depended on F-3. Finally,
the if-then-else part of the rule shown earlier will trigger, declaring the assertion to be safe.

In the other half of the case analysis, if REASON assumes the indices to be
equal, the if-then-else part of the rule will trigger, leading to the conclusion that the
assertion is not safe. It will still be true that it is possible for the indices to be equal,
so F-3 will stay in, as will the not-safe assertion derived from it. In this case,
REASON will have two justifications for believing that the assertion is not safe.

As a further syntactic convenience, I have added an if-possible-then-else construct,
which is invoked as follows:




(if-possible (:f fact-1)
 then body-1
 else body-2)


If fact-1 is possible, then body-1 is executed in a binding environment where :f is
bound to the possibility assertion for fact-1. If fact-1 is impossible (its negation is in),
then body-2 is executed in a binding environment in which :f is bound to the negation
of fact-1. This is actually a macro for the following:


(is-possible fact-1)

(rule ((:f (possible fact-1)))
  body-1)

(rule ((:f (not fact-1)))
  body-2)




Section 11.3: Safe-from and Not Safe-from

There is still a difficulty in the rules as stated. Whereas it is possible for a
single rule to determine that an assertion is not safe by looking at a single side-effect
assertion, it is not possible for it to determine that it is safe. It can only determine
that the particular side-effect being examined doesn't affect the assertion. There
might be other side-effects on this transition, however, which do affect it. It is,
therefore, necessary to be more specific in the notation, introducing a safe-from
assertion which states that the assertion is unaffected by a particular side-effect.
Similarly, the negation of such an assertion would state that the particular side-effect
does affect the assertion in question. Thus, if we had:

F-1 (Side-effect array-1 s-in s-out (index array-1 index-1 object-2))

F-2 ((Index array-1 index-2 object-1) s-in)

we could write:

F-3 (Safe-from F-1 F-2)

The safe assertion originally used above can only be deduced if the old assertion is
safe from every side-effect on this transition. To make its assumptions explicit,
REASON first gathers up all the side-effects on the transition and explicitly records
the assumption that these are all the side-effects:


F-1 (side-effects-on-transition s-1 s-2 ( ... ))

Also a rule is created which triggers if any other side-effect on this transition is
noticed; this rule will negate the assumption that all the side-effects have been
considered, thus outing the safe assertion and forcing a re-evaluation of the safety of
the assertion. Finally, a set of conjunctive goals is established to show that the
assertion is safe from each side-effect on the transition. If these succeed, the
old assertion is asserted to be safe. The safe assertion is given a justification which
points to each of the safe-from assertions gathered in the conjunctive goals, plus the
side-effects-on-transition assertion F-1 above. This guarantees that if anything is
changed (i.e. if a new side-effect is added, or if one is removed) a recalculation of the
true dependencies will be conducted.
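
The way a safe assertion is assembled from individual safe-from assertions can be sketched as below. This is a simplified Python illustration, not REASON's implementation: `judge_safe`, `different_element`, and the tuple encoding of assertions are hypothetical, and the comparison of names in `different_element` is itself a fast and dirty test that ignores possible aliasing.

```python
def judge_safe(assertion, side_effects, safe_from):
    """Return (is_safe, justification) for one assertion on one transition.

    safe_from(effect, assertion) -> bool decides a single safe-from goal.
    The justification records the assumption that side_effects is complete.
    """
    known = tuple(side_effects)
    closed_world = ("side-effects-on-transition", known)
    supports = []
    for effect in known:
        if not safe_from(effect, assertion):
            return False, None
        supports.append(("safe-from", effect, assertion))
    return True, (closed_world, *supports)


def different_element(effect, assertion):
    """Hypothetical safe-from test on ("index", array, index, value) tuples."""
    return effect[1] != assertion[1] or effect[2] != assertion[2]


old = ("index", "array-1", "index-1", "object-1")
write = ("index", "array-1", "index-2", "object-2")
result = judge_safe(old, [write], different_element)
```

Because the list of known side-effects is part of the justification, noticing a further side-effect changes that support and calls for the judgement to be recomputed.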

An example of this use of safe-from assertions is the following "fast and dirty"
rule for part side-effects, which says that a part assertion of one object is independent
of part side-effects to another object:

(rule ((:f1 (side-effect :object-1 :s-in :s-out
                         (:part-name-1 :object-1 :new-value)))
       (:f2 ((:part-name-2 :object-2 :old-value) :s-in))
       (:f3 ((object-type :object-1 :type-1) :s-in))
       (:f4 ((object-type :object-2 :type-2) :s-in))
       (:f5 (part :type-1 :part-name-1))
       (:f6 (part :type-2 :part-name-2)))
  (cond
    ((equal :object-1 :object-2))
    (t (assert '(safe-from :f1 :f2)
               '(diff-obj-part-side-eff :f1 :f2 :f3 :f4 :f5 :f6)))))

Notice that this is a fast and dirty rule, since even if object-1 and object-2 are
different object names, it is still possible for them to be anonymous objects which
might be identical. The careful version of this same rule makes this possibility
explicit by adding the following:


(if-possible (:g (id :object-1 :object-2))
 then
 (cond
   ((equal :part-name-1 :part-name-2)
    (assert '(not (safe-from :f1 :f2))
            '(poss-id-part-effect :g :f1 :f2 :f3 :f4 :f5 :f6)))
   (t (assert '(safe-from :f1 :f2)
              '(diff-obj-part-side-eff :f1 :f2 :f3 :f4 :f5 :f6)))))




Section 11.4: More Complicated Effects

So far the analysis of side-effects has been quite simple, considering only
assertions about parts and indexed-parts. These are the most primitive notions in the
system  in  the  sense  that  they  are  not  defined  in  terms  of  any  other  programming 
construct.  However,  as  we  saw  in  our  description  of  programming  objects,  a  host  of 
more  complex  notions  has  been  developed  to  allow  programs  to  be  thought  of  in  more 
high  level  terms. 

The complex relations which are often used in describing programs are logical
combinations of assertions which ultimately depend on the part structure of the objects
implementing the more abstract notion. For example, in hashing systems there is a
notion of membership in the table which is always defined in terms of membership in
one (or more) of the indexed parts (buckets) of the table. Similarly, since buckets are
frequently implemented as lists, membership of an object in the bucket reduces to
whether the object is the first part or a member of the rest part of the list. Thus,
simple side-effects to the part structure of an object can result in side-effects to
derived properties of the object. For example, modifying a table to set one bucket to
the empty list will (potentially) delete several members of the table. The processes
handling side-effects, therefore, must examine the way in which facts in the input
situation of a transition depend on one another and use this as a guide to the
transition analysis. Consider the following fragment from a hash-table-delete program:




Fragment  of  A  Hash  Table  Delete  Routine 

where the list-delete used here works by side-effect, changing cdr pointers so that
after its completion, the list will contain exactly those members of its input list which
do not have the input key. Suppose that this fragment were part of a larger plan and
that in some previous situation of this plan we had concluded that entry-1 was a
member of the table since it was a member of bucket-1, which is the bucket hashed to
by its key, key-1. Finally, let us suppose that entry-1 has the same key as that input to
the current plan fragment. Obviously, REASON ought to conclude that entry-1 is not
a member of the table after the delete operation is performed; let us follow its
reasoning process:

It follows from the protocol outlined above that in any transition involving no
side-effects, all assertions are safe. Thus, any assertion true at entrance to this
fragment will cross the transitions for fetch-bucket, reaching associative-list-delete. Let
us call the input and output situations of associative-list-delete S-in and S-out
respectively. We have the following facts:




S-in

F-1 (member table-1 entry-1)
F-2 (key entry-1 key-1)
F-3 (hash table-1 key-1 index-1)
F-4 (bucket table-1 index-1 bucket-1)
F-5 (member bucket-1 entry-1)

key-1, bucket-1 -> ASSOCIATIVE-LIST-DELETE

F-6 (side-effect bucket-1
      (not (member bucket-1 entry-1)))

Effect of Associative List Delete on Hash Table Member

Notice that the side-effect F-6 makes the assertion F-5 in S-in unsafe (by direct
negation). We can also use the rules shown so far to determine that F-4, F-3 and F-2
are safe from the side-effect F-6; since this is the only side-effect on this transition,
these are safe to move across the transition.

The membership assertion F-1 depends on F-2, F-3, F-4 and F-5 since it was derived
from these assertions using the relation-definition rule for hash-table membership. But
F-5, one of these facts, is made unsafe by the side-effect on this transition. It would
seem then that we should follow the justification from F-5 to F-1, concluding that since
one of its supports has become unsafe, F-1 should also be judged unsafe. It would
correctly follow that there is no reason to believe that entry-1 is still a member of the
table after the side-effect, i.e. that the side-effect to F-5 has caused a derived
side-effect to F-1.

This suggests using the justifications to guide an analysis of derived side-effects.
It is my feeling that an elegant extension to the TMS dependency system will make
this possible (Doyle, McAllester, and Stallman have all suggested this idea in personal
communications); however, REASON uses a different method, which is motivated by
the fact that the logical connection between facts might not be represented explicitly.
In the example above we assumed that we had deduced that entry-1 was a member of
table-1, and thus we already had a justification recording what facts this assertion
depended on. It was then a simple matter to see that the side-effect which deleted
an entry from one of the table's buckets also affected the membership assertion.
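
The idea of following recorded justifications from an unsafe fact to the facts derived from it can be sketched as below. This illustrates only the justification-based approach discussed above, which is not the method REASON uses; `derived_unsafe` and the dictionary encoding of justifications are hypothetical.

```python
def derived_unsafe(justifications, directly_unsafe):
    """Propagate unsafety from supporting facts to the facts derived from them.

    justifications  -- fact name -> set of fact names it was derived from
    directly_unsafe -- facts made unsafe by the side-effects themselves
    """
    unsafe = set(directly_unsafe)
    changed = True
    while changed:                      # iterate until no new fact is affected
        changed = False
        for fact, supports in justifications.items():
            if fact not in unsafe and supports & unsafe:
                unsafe.add(fact)
                changed = True
    return unsafe


# The hash-table example: F-1 was derived from F-2 through F-5.
affected = derived_unsafe({"F-1": {"F-2", "F-3", "F-4", "F-5"}}, {"F-5"})
```

The sketch works only when the derivation has actually been recorded, which is exactly the limitation taken up next.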




However,  suppose  that  this  membership  assertion  had  not  been  deduced,  but  had 
only  been  told  to  us  (as  an  output  assertion  of  some  other  sub-segment's  specs)  or  that 
it  had  been  assumed.  Then  the  only  justification  for  the  membership  assertion  would 
be  a  dependency  on  the  spec  clause  or  on  the  reason  for  making  the  assumption. 


(spec clause or assumption reason)

S-in

Bucket-delete

F-1 (Member Table Entry-1)

F-2 (side-effect
      (Not (member bucket-1 entry-1)))

Side-effect on Unexpanded Defined Relation


Notice that in this circumstance, which is actually much more typical than that shown
earlier, there is no set of justifications linking the membership assertion F-1 to the
assertion negated in the side-effect F-2. However, consider what would happen if F-1
were expanded into its definition. This would produce exactly the facts F-2, F-3, F-4
and F-5 which we saw in the earlier example. As these are asserted, they will
establish the logical connection between F-1 and F-5 that we saw earlier.

The  key  problem,  therefore,  is  for  REASON  to  identify  those  circumstances  in 
which  it  is  necessary  to  force  this  expansion  of  defined  relations.  Rules  of  the 
following form would at first glance seem sufficient:


(rule ((:f1 ((member :table :entry) :s-in))
       (:f2 (side-effect :table :s-in :s-out
                         (not (member :bucket :entry))))
       (:f3 ((object-type :bucket bucket) :s-in)))
  (assert '(expand ((member :table :entry) :s-in))
          '(expand-for-transition-processing :f1 :f2 :f3)))

Rules  like  the  one  above  could  be  created  by  analyzing  the  definition  of  the 
relation  involved  yielding  an  expansion  rule  for  each  clause  involved  in  the  definition. 
However,  I  have  approached  the  problem  somewhat  differently.  This  is  discussed  in 
the  next  section. 




Section 11.5: Determining What's Affected

As I mentioned earlier, defined relations introduce a connection between assertions
which must be analyzed in side-effect processing. For example, let us define list
membership in the standard way: an object is a member of a list if it is either the first
of the list or a member of the rest of the list. A side-effect changing the first of a
list might change a membership relation. Similarly, in the above example, we saw
the connection between bucket parts of a table and membership in the table.

When  presented  with  a  relation-definition,  REASON  produces  rules  used  in 
transition  processing. 

Relation definitions have the form:

(relation obj1 obj2 ...) <=> (compound-expression obj1 obj2 ...)


where the compound expression is a combination formed from the logical connectives
AND, OR, NOT, FOR-ALL, THERE-IS, IMPLIES, IF-THEN-ELSE. The compound expression may
also involve the use of reference expressions which, in effect, introduce new
objects on the right hand side of the definition which are not mentioned in the
left hand side. For example:


(Member List Object) <=> (Or (First List Object)
                             (Member [Rest List] Object))

makes reference to the rest of the list, which is not an object mentioned on the
left hand side. In an expanded form we might write this as:


(Or (First List Object)
    (And (Rest List List-1)
         (Member List-1 Object)))




where list-1 is a new object introduced to resolve the reference expression. From
this we can extract two forms of information. One is a network of
potential-dependency assertions, linking assertions to those side-effects which
might make them unsafe. The second form of information is a set of REASON rules
which assert safe and unsafe assertions. For example, from the definition for list
membership we can get the following potential dependency assertions:

(potential-dependency (member :list object-1)
                      (first :list object-2))

(potential-dependency (member list object-1)
                      (rest list-1 list-2))

(potential-dependency (member list object)
                      (not (member list-1 object)))

Potential dependency assertions are the information used to determine that there
might be a logical connection between a fact and a side-effect. These say that if
(1) there is an assertion in the input situation which matches the first pattern
and (2) there is a side-effect on the transition which matches the second pattern,
then it is possible that the assertion is rendered unsafe by the side-effect.
Notice that in the case of dependencies on non-functional relations (such as
member) the dependency is on the negation of the relation. If assertions about
functional relationships (such as part or indexed-part) are involved, a
side-effect asserting a new value for the relation (for example, a new first of a
list) implicitly negates any previous value of the property (the old first of that
list); for these relations the dependency pattern is not negated. Also note that
we have omitted the object-type information that goes with the assertions;
however, since these assertions are used only to find things which might be
affected, omitting the object-type information will simply allow some assertions
to be considered even though they are not affected. This can do no harm; it can
only make the system overly cautious.

The network of potential dependency assertions is completed by using a
transitivity rule to reflect the fact that if f-1 depends on f-2, which in turn
depends on f-3, and if f-3 is made unsafe by a side-effect, then f-2 and, in turn,
f-1 also become suspect:




(rule ((:f (potential-dependency :a :b))
       (:g (potential-dependency :b :c)))
  (assert (potential-dependency :a :c)
          (pd-trans :f :g)))
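The effect of the transitivity rule can be illustrated with a small Python sketch. It is an assumption-laden model only: facts are plain strings, a dependency is a (fact, depends_on) pair, and the function names are invented for the example.

```python
def close_transitively(dependencies):
    """Return the transitive closure of a set of (fact, depends_on) pairs."""
    closure = set(dependencies)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def suspects(dependencies, unsafe_fact):
    """Facts that become suspect once unsafe_fact is made unsafe."""
    return {a for a, b in close_transitively(dependencies) if b == unsafe_fact}
```

For example, with f1 depending on f2 and f2 on f3, making f3 unsafe renders both f2 and f1 suspect.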

The information in the potential dependency assertions is used by a rule which
monitors transitions looking for facts which might be made unsafe by a
side-effect. If such situations are noticed, the rule asserts that the fact is
possibly unsafe. Any fact which is possibly unsafe is expanded.


(rule ((:f (potential-dependency :a :b))
       (:g (side-effect object s-in s-out :b))
       (:h (:a s-in)))
  (Assert '(possible (not (safe-from :g :h)))
          '(pd :f :g :h)))

(Rule ((:h (Possible (not (safe-from :f :g)))))
  (Assert '(expand :g) '(pd-expand :h)))

The rules for developing the potential dependencies are as follows: If the
connective is AND or OR then build a potential dependency for each clause of the
conjunction or disjunction. (IMPLIES p q) is logically equivalent to
(OR (NOT p) q) and is handled accordingly. Similarly, IF-THEN-ELSE is built from
IMPLIES. The quantified statements require a brief explanation. If we have


f-1: (for-all vars p q)

then two kinds of side-effects could make f-1 become not true. One is a
side-effect which causes some object which does not satisfy q to satisfy p,
creating a counter example to the universal quantification. The other is a
side-effect to an object which currently satisfies both p and q so as to make it
no longer satisfy q. Therefore, universally quantified statements potentially
depend on both p and q. A similar argument holds for existential quantification.
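The connective-by-connective construction can be sketched as a recursive walk over a definition. The Python below is a simplified illustration, not the procedure REASON uses: the nested-tuple encoding and the function name are assumptions, and the negation of patterns for non-functional relations discussed earlier is deliberately not modelled.

```python
def potential_dependencies(expr):
    """Collect the clauses a defined relation potentially depends on.

    expr is a nested tuple: ("and", e1, ...), ("or", e1, ...), ("not", e),
    ("implies", p, q), ("if-then-else", c, t, e), ("for-all", vars, p, q),
    ("there-is", vars, p, q), or an atomic clause such as ("first", "list", "obj").
    """
    op = expr[0]
    if op in ("and", "or"):
        found = []
        for clause in expr[1:]:
            found.extend(potential_dependencies(clause))
        return found
    if op == "not":
        return potential_dependencies(expr[1])
    if op == "implies":
        _, p, q = expr
        # (implies p q) is treated as (or (not p) q)
        return potential_dependencies(("or", ("not", p), q))
    if op == "if-then-else":
        _, c, t, e = expr
        return potential_dependencies(
            ("and", ("implies", c, t), ("implies", ("not", c), e)))
    if op in ("for-all", "there-is"):
        _, _variables, p, q = expr
        # a quantified statement potentially depends on both p and q
        return potential_dependencies(p) + potential_dependencies(q)
    return [expr]  # atomic clause
```

Applied to the expanded membership definition, the walk yields the first, rest and nested member clauses, one potential dependency per clause.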




Two points about these rules for determining potential dependency should be
noted. First, these rules only signal the possibility that an assertion is
affected by a side-effect; it is for other, more thorough rules to explore whether
or not the assertion actually is safe. This allows a many layered control
structure in which one set of rules notices candidates for examination, and other
sets of rules choose to examine these candidates at a level of detail deemed
appropriate.

The second point to be made here is that the potential dependency rules shown so
far are actually of the "fast and dirty" variety. Remember that the swap example
showed that different local variable names might, in fact, name the identical
object. Usually people rule out this possibility of "aliasing" to facilitate their
analysis. However, to be completely accurate one must examine all possibilities.

The careful version of a rule for list membership, for example, is:


(rul*  ( (  f  (»ide  effect  obj  »-in  i -  out  (fir»t  obj  : n«x- f irit ) ) ) 

(  g  (tmm>b*r  obj-/  o'd-fir»t)  »-m))) 

(It  potiibl*  (  h  (id  obj  obj-/)) 

then  (lilirt  (poil’bl*  (not  (itft-fron  f  g))) 

'(poti-»r-»e-c«r*ful  : f  g  .h)))) 

This requires that the system have rules for determining whether objects are
identical or not, and furthermore that it maintain this information rather
carefully. Fortunately, most procedures do not involve a large number of objects
so this task is tractable. There are several ways in which the system can deduce
the non-identity of objects (we have already discussed ways in which it can
determine identity). One rule is that if an object is newly created in a situation
which comes after a situation in which a second object was known to exist then the
two objects are not identical:


(rule ((:f (side-effect (new :object-1) s-in s-out :fact))
       (:g (occurs-in :object-2 (:fact-2 s-other)))
       (:h (comes-before s-other s-out)))
  (assert '(not (id :object-1 :object-2))
          '(diff-date-of-birth :f :g :h)))




A second rule for non-identity uses the disjointness relation between types in the
object-type hierarchy to infer that two objects have different types and are,
therefore, distinct. Finally, both of these are special cases of the general rule
that if a property holds of one object but not of the other then those objects are
distinct.
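The two non-identity rules can be illustrated with a short Python sketch. Everything here is hypothetical scaffolding for the example: the Obj record, the numbering of situations by integers, and the contents of DISJOINT_TYPES are assumptions, not REASON's representation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Obj:
    name: str
    type: str
    created_in: Optional[int] = None  # situation in which the object is newly created, if known
    seen_in: Optional[int] = None     # earliest situation in which it is known to exist

# Hypothetical disjointness facts from an object-type hierarchy.
DISJOINT_TYPES = {frozenset({"list", "hash-table"}), frozenset({"atom", "list"})}

def born_after(new, old):
    """new was created in a situation later than one in which old already existed."""
    return (new.created_in is not None and old.seen_in is not None
            and old.seen_in < new.created_in)

def known_distinct(a, b):
    """True when non-identity can be deduced; False only means 'cannot tell'."""
    if frozenset((a.type, b.type)) in DISJOINT_TYPES:
        return True
    return born_after(a, b) or born_after(b, a)
```

A False result is not a proof of identity; it simply leaves the aliasing question open for more careful rules.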

Once it has been determined that an assertion is possibly affected by a
side-effect it remains to be determined whether the assertion is safe or unsafe. A
second set of rules is developed from the relation-definitions, by going through
the logical connectives used in the definitions. For example, a conjunction in
which one conjunct has been side-effected can be deduced to be unsafe. However, a
disjunction must be analyzed further. The following rule conducts this analysis
for list membership:

(rule ((:f1 (possible (not (safe-from :f2 :f3))))
       (:f2 (side-effect :list s-in s-out (first :list :obj-1)))
       (:f3 ((member :list :obj-2) s-in))
       (:f4 ((rest :list :rest) s-in)))
  (if-possible (:f5 ((not (member :rest :obj-2)) s-in))
    then (Assert '(Not (safe-from :f2 :f3))
                 '(dj-not-safe :f1 :f2 :f3 :f4 :f5))
    else (Assert '(safe-from :f2 :f3)
                 '(dj-safe :f1 :f2 :f3 :f4 :f5))))

If it is possible that the old first element of the list occurred only in the
first position, then it is possible that the side-effect of changing the first of
the list would cause that element to cease to be a member of the list. Thus, a
cautious strategy avoids moving this fact over the transition until more
information is known. If it is ever learned that the object was definitely a
member of the rest of the list then the assertion will be declared safe by the
second clause of the if-possible rule. In the mean time, this cautious strategy
prevents any defaulting strategy of the first pass analysis from being too lax.
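The cautious treatment of the disjunction can be made concrete with a small three-valued sketch in Python. The function names and the use of None for "unknown" are assumptions made for the illustration; REASON expresses the same idea with its if-possible construct.

```python
def membership_status(obj_is_old_first, obj_in_rest):
    """Classify (member list obj) across a side-effect that replaces the first.

    obj_in_rest is True, False, or None when nothing is known yet.
    """
    if not obj_is_old_first or obj_in_rest is True:
        return "safe"             # membership is supported by the untouched rest
    if obj_in_rest is False:
        return "unsafe"           # the only occurrence was the replaced first
    return "possibly-unsafe"      # held back until more information arrives

def still_member(items, obj, new_first):
    """Ground truth for a fully known Python list, used to check the policy."""
    return obj in [new_first] + items[1:]
```

With complete knowledge the status agrees with direct computation; with partial knowledge the fact is simply not moved across the transition.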

Notice that side-effect rules such as the one above are triggered by the possibly
unsafe assertion, rather than by the side-effect assertion directly. (The possibly
unsafe assertions are created by the potential dependency rules.) This allows
other rules to decide which assertions should be worked on. In a later section we
will see a set of rules which rule out possible unsafety, helping the system to
avoid useless work.




Since we want to consider all side-effects which might arise, let us consider an
example in which the above side-effect leads to a derived side-effect. Suppose
that in addition to the assertions above about list membership, we also had an
assertion stating that the object deleted from the list was a member of some
hash-table in the input situation. Since we have concluded that the list
membership assertion was not safe, if it is possible that this list is a bucket of
the hash-table we should conclude that the hash-table membership assertion is
unsafe as well. To start this process, however, we must first state that there has
been a side-effect to the list so that the potential dependency rules may trigger.
This is done by a simple rule which translates unsafe assertions into side-effect
assertions:

(tut*  ((  f  (Not  (»»f»  Irw  f 2  fl))» 

(  tf  (Sid«  ittKt  obj  1-in  t  out  >•)) 

(  f  S  (  foci  in  nt))) 

(•start  •  (S ido  t f f oc l  obj  i  in  (-out  ((Not  fact)  » - tn ) ) 

‘  ( tf oni • io  f  ) ) ) 

The effect of this rule is that every time an assertion is determined to be
unsafe, it is then treated as a side-effect itself, initiating a consideration of
derived side-effects. In this case, this will lead to the triggering of the rule
for hash-table membership:

(rulo  ((  f|  (poil'b'e  (not  (»oT«  from  t2  f 3 ) ) ) ) 

(  tl  (»'d»  offoct  lift  i-m  sout  (Not  (K*<»boc  t'lt  obj)))) 

(  f)  (("*««b*r  toblo  object)  »-m)) 

(  Q 1  (<A»y  obj  1)  » • m ) ) 

(  g 2  ((bo»b  toblo  lojcl  tndoa-1)  l-m))) 

(If  Potubio  (  h  ((bucket  toble  indee-l  lilt)  i-ln)) 
then  (Anert  ’(not  (s«f«-fro*  ft  f 3 ) ) 

’(•e  tob  f)  ft  TJ  gl  gt  h)) 

e1»e  (Aued  '(i»le  *ro»  ft  TJ) 

•(tetob  M  ft  fJ  gl  gt  h)))) 


Thus the side-effect propagates through the various levels of definition. Notice
that when a definition involves reference expressions these are handled somewhat
specially. If the side-effect is to a clause within the definition which has
reference expressions inside it (as does the definition for membership in the list
which implements a bucket of a hash-table), then these reference expressions are
converted to patterns and moved outside the if-possible expression. However, if
the side-effect challenges a reference expression nested inside other reference
expressions, then the outer references must also be stated in the if-possible
construct.




There are still other logical connectives to consider in the building of
side-effect rules. Universal quantification presents analytic problems similar to
those of disjunction. Consider the definition of being an alist: a list, all of
whose members are pairs whose key parts are atoms:


(Object-type list Alist)
  <=> (For-all (:el) (member list el)
               (object-type [key el] Atom))


As I mentioned above, two different kinds of side-effects can make an assertion of
this form unsafe. A side-effect which entered a new element into the list might
challenge such an assertion (if the key of the new object isn't an atom).
Similarly, changing the key of one of the existing elements could undo the truth
of the quantified statement if the new key is not an atom. Thus, a side-effect
rule for universally quantified statements must trigger in either event and then
examine whether to declare the statement unsafe. The following rules do this for
the above definition:


(rolf  ((  T)  (poti’bl*  (nol  (t*fp-fro«  f2  f  1 )  ) ) ) 

(  11  (ndt- *ff »t t  ob)  t-in  t-out  (Ity  :ob)  kty-2))) 

(  f)  ( (  ob)«t t • t ypt  lilt  §Hit)  lout))) 

(if-potubl*  (  1*  ((«t»btr  Hit  obj)  l-1n)) 
thon  (aiifrt  '(not  (loft-Ho*  11  .TJ)) 

'(upnioTocouttoui  fl  11  :f*  ;fl)) 

(Tit  (Olitrt  ’(tofo-froa  11  TJ) 

'  ( up  *  lot  •  - 1  ft  -.11  :ti  :  f  4 ) ) ) ) 

(ru'o  ((  fl  (pombTt  (not  (iofo-frt*  f|  fl)))) 

(  11  (STdt  (fftet  ob)  i-tn  i  out  (■t«bor  :H»t  ob)))) 

(  fl  ( (  ob  jtc  t  *  t  ypt  Tilt  tlllt)  tin)) 

(  1*  ((tty  ob)  tty)  I ■ tn ) ) ) 

( »f  poiiTblt  (  fS  ((not  (ob)tct  typo  tty  otoa))  :»-Tn)) 
thtn  (Antrt  '(Hot  (lOft  froa  11  fl)) 

'(up-net  iift-coutToui  M  11  .11  :14  :fS)) 

(Tit  (  At  it  H  ’(lOft-frtM  It  .HI 

’ ( up • I »f t • 2  :f 1  f2  :fl  :f«  :TS)))) 

The above rules are examples of the "fast and dirty" type in that they do not
attempt to check for the identity of anonymous objects. A second version of these
rules (along the lines shown earlier for the "second pass" rules) does the extra
checking.




Section  11.6:  An  Example 

Let us now look at how REASON uses these rules to analyze a hash-table deletion
routine which deletes all members of the table with a given key. One wants to
prove three things about this program: (1) The desired elements were deleted, (2)
Nothing else was deleted, and (3) Nothing extra was added. Each of these is a
universally quantified statement:

(for-all (:entry) ((member table entry) in-sit)
  (if ((key entry key-1) in-sit)
      then (side-effect ((not (member table entry)) out-sit))))

(for-all (:entry) ((member table entry) in-sit)
  (if ((not (key entry key-1)) in-sit)
      then ((member table entry) out-sit)))

(for-all (:entry) ((member table entry) out-sit)
  ((member table entry) in-sit))

To prove the first of these REASON assumes that there is an anonymous object
entry-a which is a member of the table and whose key is the given key (in the
input situation). It then attempts to show that the table has been modified so
that this entry is not a member of the table in the output situation. The
following is a complete plan diagram for the program with accompanying assertions
which follow from the symbolic evaluation.




As I described above, the membership assertion involving entry-a will pass through
each of the first transitions, but will be stopped by the transition representing
the action of bucket-store. The assertion will then be expanded into its
definition:


((member Table-1 Entry-A) s-in) <=> ((key Entry-A Key-1) s-in)
                                    ((hash Table-1 Key-1 Index-1) s-in)
                                    ((bucket Table-1 Index-1 Bucket-1) s-in)
                                    ((member Bucket-1 Entry-A) s-in)

The support structure for the unsafe assertion associated with the membership
assertion is now constructed. This structure makes the safety of the membership
assertion depend on the safety of the bucket assertion. But since this assertion
is unsafe, the membership assertion is also deduced to be unsafe. However, the
assertion




((Member bucket-1 Entry-A) s-in)

does move safely up to the list-delete routine whose specs say that it creates a
new bucket which contains all entries in bucket-1 except those whose key is key-1.
Entry-a, however, is asserted to have key-1 as its key; it is, therefore, not a
member of bucket-2, the output of list-delete. We have:


((Not (member bucket-2 entry-a)) s-6)

Notice that this assertion is safe to cross the transition representing the
bucket-store side-effect, as are the hash and key assertions. Thus in s-7, the
output situation of bucket-store, we have:


((Not (member bucket-2 entry-a)) s-7)
((hash table-1 key-1 index-1) s-7)
((key entry-a key-1) s-7)
((bucket table-1 index-1 bucket-2) s-7)

from which the antecedent inference rule corresponding to the relation-definition
for hash-table membership infers that entry-a is not a member of the table. Since
this inference depends directly on a side-effect at this transition it is also a
side-effect. We thus have:


(Side-effect table-1 s-in s-out
             ((Not (member table-1 entry-a)) s-out))

which was the subgoal needed to deduce the desired universally quantified
statement. So we have shown that all entries with the given key are deleted.

Now REASON has to show that nothing was deleted which should not have been. Again
it creates an anonymous object entry-b, assuming that entry-b is a member of the
table and that its key is not key-1. The facts propagate similarly to above;
however, when REASON tries to expand this assertion it discovers that it does not
know the key of entry-b (we only know that its key is not key-1) and, therefore,
that we also don't know the index this key hashes to or the bucket which is in
that slot of the table. Anonymous objects are created to stand in for all of
these.




((Member table-1 Entry-B) s-in) <=> ((key Entry-B Key-B) s-in)
                                    ((Hash table-1 Key-B Index-B) s-in)
                                    ((bucket table-1 Index-B Bucket-B) s-in)
                                    ((Member Bucket-B Entry-B) s-in)

The transition processing becomes somewhat more complicated. We have the following
assertions and side-effects involved:


((bucket table-1 index-B Bucket-B) s-in)

(Side-effect table-1 s-in s-out
             ((bucket table-1 index-1 bucket-2) s-out))

The source of the problem is that REASON does not know whether index-1 and index-b
are equal or not, since index-b is an anonymous object. Therefore, REASON engages
in a case analysis, splitting into two cases: (1) index-1 equal to index-b and (2)
index-1 distinct from index-b. Each of these cases gives the desired results
rather directly. If the two indices are distinct then the bucket assertion above
is safe. Similarly, all the other supporting assertions are safe, leading to the
result that the membership assertion itself is safe, i.e. in this case, the
membership of entry-b in the table is unaffected by the changing of the bucket
since it is in a different bucket.

In the other case, index-1 is equal to index-b and thus bucket-1 is identical to
bucket-b. An identification (see Chapter 4) of bucket-b to bucket-1 is performed,
leading to the conclusion that entry-b is a member of bucket-1 in s-in and,
therefore, by the specs of list-delete, entry-b is also a member of bucket-2,
which is then stored into the table. As above, this leads to the conclusion that
entry-b is a member of the table in the output situation.
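The three properties argued for above can be checked on a concrete instance. The Python below is a stand-in for the deletion routine, written under assumptions of my own (entries are (key, value) tuples, the table is a list of buckets, and hash_fn is supplied by the caller); it tests one example and is not a substitute for the symbolic proof.

```python
def list_delete(bucket, key):
    """Return a new bucket holding every entry whose key differs from key."""
    return [entry for entry in bucket if entry[0] != key]

def table_delete(table, key, hash_fn):
    """Delete every entry with the given key; only one bucket is stored into."""
    index = hash_fn(key) % len(table)
    table[index] = list_delete(table[index], key)  # the bucket-store side-effect

def members(table):
    """All entries of the table, bucket by bucket."""
    return [entry for bucket in table for entry in bucket]
```

In a small table where "a" and "e" hash to the same bucket and "b" to another, deleting "a" exercises both branches of the case analysis: "e" survives because list-delete keeps it, and "b" survives because its bucket is untouched.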




Section  11.7:  Pseudo  Parallelism 

As I have mentioned, plan diagrams allow a weak form of parallelism. Although I am
not at this time interested in the extra problems (and opportunities) presented by
parallel execution, I have found this parallelism a convenient way of capturing
some generalities of sequential processes. For example, in representing the most
general form of the binary-tree-traversal plan fragment, we found parallelism
allowed us to represent the many possible traversal orderings in a single plan
diagram.

However, parallelism presents special difficulties when side-effects are
introduced into the programming discipline. Consider the plan diagram for
MacLisp's NREVERSE:


Without a control-flow link ordering the execution of the RPLACD and the CDR
segments, there is no guarantee that one segment will execute before the other.
Indeed, there is no information at all in this diagram about the mutual ordering
of these two segments. Thus, it is necessary to regard them as executing in
parallel and, therefore, capable of destructive interference.
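The interference is easy to reproduce in an ordinary sequential language. The sketch below is an iterative destructive reversal written in Python with a hypothetical Cons class; it is not MacLisp's actual NREVERSE code, only an illustration of why the CDR must be taken before the RPLACD is performed.

```python
class Cons:
    def __init__(self, car, cdr=None):
        self.car, self.cdr = car, cdr

def from_list(items):
    head = None
    for item in reversed(items):
        head = Cons(item, head)
    return head

def to_list(cell):
    out = []
    while cell is not None:
        out.append(cell.car)
        cell = cell.cdr
    return out

def nreverse(cell):
    previous = None
    while cell is not None:
        following = cell.cdr   # CDR: read the old rest first
        cell.cdr = previous    # RPLACD: destructive side-effect
        previous, cell = cell, following
    return previous

def nreverse_wrong_order(cell):
    previous = None
    while cell is not None:
        cell.cdr = previous    # RPLACD executed first
        following = cell.cdr   # CDR now sees the new rest, not the old one
        previous, cell = cell, following
    return previous
```

With the two steps in the wrong order the rest of the list is lost after the first cell, which is exactly the destructive interference an unordered plan diagram cannot rule out.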




The plan diagram formalism regards any data-flow as taking a finite amount of
time. In fact, since data flows might be implemented by a pathway of many segments
as in a queue-and-process plan, the time involved might be considerable.
Therefore, it is also possible for the RPLACD to have a destructive interference
with the data-flow to NREVERSE.

It follows that the transition analysis which I have discussed so far is too
simple, since it has been conducted under the unstated assumption that plan
diagrams are interpreted in a strictly sequential manner. Under this assumption
all data flows preserve all properties and the only transition analysis required
is at the transitions representing segments with side-effects. This will now have
to be generalized to take account of the extra complexity posed by the possibility
of parallelism.

This generalization is an absolute necessity for the plan based analysis used in
the programmer's apprentice as a whole, since its approach is to develop a
catalogue of programming cliches. Many of these cliches, however, are enumerators
such as a trailing pointer enumeration, to which other consumer plans are
attached. When viewed from this perspective, there is an inherent parallelism
between the enumerator and consumer plans. In the case of NREVERSE, this
parallelism involves side-effects which must be analyzed correctly.

A transition is redefined to be a pair of situations which are (1) connected by a
data-flow or a control-flow link, or (2) the input and output situations of a
segment. In general, a fact which holds in the earlier situation of a transition
can be moved to the later situation if (a) it is safe to move the fact across the
transition in question and (b) there is no other transition which could execute
during the same time as the one in question which would render the fact unsafe. To
move a fact from one end of a data-flow link to the other, one must first inspect
whether there is some segment which can execute in parallel with that data-flow
and which has a side-effect which threatens the fact.

To make this inspection easier, before performing the symbolic evaluation of a
plan diagram, REASON first analyzes the data- and control-flows, breaking the
diagram up into separate paths. These paths can then be separated into sets of
parallel paths. Two transitions, one on each of two parallel paths, can execute in
parallel. Once this analysis of the plan diagram into parallel paths is completed,
the transition analysis above can be generalized quite simply. Side-effect rules
are now triggered by the combination of three types of facts: (1) the existence of
a side-effect,




(2) the existence of a fact in the earlier situation of a transition, and (3) the
possibility that the transition corresponding to the side-effect is on a path
parallel to the one on which the transition occurs:


(Rule ((:f1 (:fact :s-1))
       (:f2 (Transition :s-1 :s-2))
       (:f3 (On-path :path-1 :s-1 :s-2))
       (:f4 (Side-effect :obj :s-3 :s-4 :new-fact))
       (:f5 (On-path :path-2 :s-3 :s-4))
       (:f6 (Parallel :path-1 :path-2)))

  appropriate transition processing

  )

The actual analysis of the plan diagram into paths is rather simple. It begins by
identifying path joining segments and path splitting segments, i.e. those segments
at which two flows (data or control) come together at a single segment and those
at which two flows diverge from a single segment. Segment execution can only begin
when all the inputs are present; thus, when two data flows join at a segment a
synchronization point is established. Similarly, since no output leaves a segment
until all the outputs are ready, a synchronization point is established at segment
output as well.


[Figure: Path Joining and Splitting (a path splitting segment and a path joining
segment)]




When a path splitting segment is noticed, two new path names are created, one for
each diverging flow. The splitting segment is declared to be the head of both of
these paths and the paths are declared to be parallel. Similarly, when a path
joining segment is noticed it is declared to be the tail of both paths entering
it, and these paths are declared parallel. A segment which is entered by only a
single flow, or by several flows each of which originates at the same segment, is
declared to be on the same path as the segment from which the flows came. This
last step, however, is made to depend on the absence of other entering flows, so
that if new flows are added to the diagram, new paths will be recalculated.

Several other rules are also involved in the calculation. For example: Two
segments which are at the terminal end of conditional-control-flow links
originating from the same segment are on separate but non-parallel paths. A pair
of paths is parallel if it consists of one path internal to each of two segments
where the two enclosing segments are on parallel paths.

Consider NREVERSE again; a bug exists if there is no control-flow ordering the
execution of the RPLACD and CDR segments. In the path analysis, these two segments
will be analyzed to be on parallel paths; this will lead to the conclusion that
there might be destructive interference between the RPLACD and the data flow to
the CDR. The side-effect rules conclude that it is possible that the flow does not
preserve the rest property.

However, this depends on the assumption that there are no further flow links
ordering the two segments. If the programmer should intervene, adding a
control-flow link to make the RPLACD follow the execution of the CDR, then this
assumption will be violated and the paths incrementally recalculated. In the new
calculation of paths the RPLACD will not be on a path parallel to the data-flow to
the CDR. But then, one of the facts supporting the unsafe declaration will be out,
outing the unsafe assertion itself.
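The effect of adding the control-flow link can be illustrated with a much coarser test than the path naming described above. The Python sketch below treats two segments as possibly parallel when neither can reach the other along the flow links; this is my own simplification, not REASON's path analysis, and it ignores conditional control flow (which the text says yields separate but non-parallel paths).

```python
def reachable(flows, start):
    """Segments reachable from start along (source, destination) flow links."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for source, destination in flows:
            if source == node and destination not in seen:
                seen.add(destination)
                stack.append(destination)
    return seen

def may_run_in_parallel(flows, a, b):
    """True when no chain of flow links orders a and b either way."""
    return a != b and b not in reachable(flows, a) and a not in reachable(flows, b)
```

Recomputing the answer after a new link is added plays the role of the incremental recalculation of paths: once the CDR is ordered before the RPLACD, the support for the unsafe assertion disappears.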




Chapter 12: Reducing Complexity in Side Effect Analysis

In describing data-structures I defined a notion of structure sharing in a
recursive-structure. Much of the complexity in reasoning about side-effects occurs
in recursively defined structures which share some substructure. Suppose that we
know of the existence of two lists, and one of these is side-effected; given what
we have developed so far, we must consider the possibility that this side-effect
will change some properties of the second list as well. However, if we knew that
the two structures were disjoint, then this possibility would be eliminated,
reducing the complexity considerably. [Burstall, 1972] introduces some techniques
for reasoning about side-effects which use this notion of disjointness to
advantage. I will extend that notion in this chapter.

Let us examine in a bit more detail why this is true. The following is a
side-effect rule for list membership:


(Rule  ((  M  (iirtt  I'lt-I  f-in  i  out  (tint  :ll»t-l  obj-1))) 

(  tl  ((*pi»b*r  Ini  ’  obj  i) 

\  If  pomble  (  fj  ((>ubliit  lul-2  lul-l)  »-in)) 

th«n  (■it*Pt  (not  (ft*  from  fl  tl))  (lllt-M*  :f]  It  :  f  J ) ) 
•  1»»  (illirt  (iili  fro*  fl  tl)  (Mit-ac*  fl  tl  fi)))) 


This rule is derived from the following definition:


(Rotation  lilt  Objtct) 

( dt r in 1 1 ion  (M*«b«p  lnt  Obj) 

<•>  (Op  (Mr»t  Hit  Obj) 

(Thfro  it  (  tub)  (Sublut  lilt  Sub) 

sucti-tbot  (Honbop  Sub  Obj))))) 


Suppose that we know that list-1 and list-2 do not share any structure; we can
determine quite simply that it is not possible for the side-effected list to be a
sublist of the other. This removes the need to conduct a thorough investigation as
outlined in the preceding sections.




I will present in this section a hierarchy of classification for side-effects,
properties and the degree of sharing exhibited by recursive-structures. Given that we
have seen how lists and trees can be defined as special kinds of recursive-structures,
this classification will be applicable to most of the useful structures of LISP
programming. The purpose of this classification is to use the level of sharing to limit
the degree to which side-effects to one structure can affect properties of the other.
Similarly, the classification of properties into levels isolates properties of a higher level
from less powerful side-effects.

Actually sharing is not as important as the lack of it: disjointness. I have
identified three types of disjointness which have some utility. I have previously
defined structure sharing as having a node in common. Structurally disjoint structures
are those which do not share structure. For lists this means that no sublist (the
transitive closure of cdr) of the two lists is shared.

Often we will have non-recursive structures such as hash-tables whose parts are
recursive structures. For such objects, we define structure sharing in the obvious way:
two objects share structure if there is a part of the first and a part of the
second which share structure. Thus, a hash-table and a list share structure if one of
the table's buckets shares structure with the list. They are structurally disjoint if
there is no bucket of the table which shares structure with the list.

The next type of sharing is termed value sharing. Recall that the nodes of a
recursive structure can have other parts (called values) besides those which represent
the immediate children of the node. A list is a recursive-structure which has a value
at each node called the first. Similarly some types of binary-trees have a value at
each node (see Chapter 10 for a review of these notions). When there is an object
which is a value of two recursive structures, we say that there is value sharing;
conversely, if there is no such object we say that the structures are value disjoint.
Notice that if two objects share structure, they then must share values: since they
have at least one node in common, the value of this node is a value of the two
structures; therefore they share values. It follows that if two structures are value
disjoint, they are also structurally disjoint.




Two objects are totally disjoint if (1) the objects are both structurally and value
disjoint and (2) all objects pointed to by each node are totally disjoint. For example,
two lists are totally disjoint if they share no sublists, if the members of the lists are
distinct, and if as well the members of the first list are totally disjoint from the
members of the second.
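
The three kinds of disjointness can be illustrated with a small sketch. It is written in Python rather than MacLisp and is not taken from REASON: the `Cons` class and all function names are assumptions made for illustration, object identity stands in for Lisp's eq, and `totally_disjoint` only approximates clause (2) by recursing into members that are themselves lists.

```python
class Cons:
    """A mutable list cell: `first` is the value part, `rest` the immediate child."""
    def __init__(self, first, rest=None):
        self.first = first
        self.rest = rest


def from_items(items):
    """Build a fresh chain of cells holding the given items."""
    head = None
    for item in reversed(items):
        head = Cons(item, head)
    return head


def cells(lst):
    """Yield every cell reachable through `rest`, i.e. every sublist of lst."""
    while lst is not None:
        yield lst
        lst = lst.rest


def share_structure(a, b):
    """True if some cell (node) belongs to both lists."""
    seen = {id(c) for c in cells(a)}
    return any(id(c) in seen for c in cells(b))


def share_values(a, b):
    """True if some object is the `first` of a cell in both lists."""
    seen = {id(c.first) for c in cells(a)}
    return any(id(c.first) in seen for c in cells(b))


def totally_disjoint(a, b):
    """Structurally and value disjoint, and list-valued members are too."""
    if share_structure(a, b) or share_values(a, b):
        return False
    return all(totally_disjoint(x.first, y.first)
               for x in cells(a) for y in cells(b)
               if isinstance(x.first, Cons) and isinstance(y.first, Cons))
```

Two lists built over a common tail share structure and therefore share values, while two freshly built lists holding the same atom share values only.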

It follows that if two recursive structures are totally disjoint, then side-effects to
the one cannot affect the other. Unfortunately, although total disjointness is not
completely rare, it is not the most common event either. In particular, lists frequently
have common members such as atoms, and are, therefore, not totally disjoint.
However, structure sharing is also reasonably rare. The sharing of list structure
presents enormous opportunities for powerful interactions and thus for bugs; therefore,
most programmers avoid sharing except in those cases where the power is actually
desired. Most side effects have very limited scope as long as there is no structure
sharing.

To begin let us classify side effects in a manner similar to that used for
structure sharing. We call side effects strictly structural if they only affect the
immediate children property of some node of the structure. Rplacd is the simplest such
side effect, although list-insert and list-delete, nreverse, sort, nconc and various other
built-in functions of MacLisp also are strictly-structural side-effects. Notice that a
strictly structural side effect to one structure will not affect any property of another
object which is structurally disjoint from the first.

We may also identify strictly value side effects such as rplaca which only change
value parts of a recursive-structure. If two structures are structurally disjoint, then a
strictly value side effect to one will not affect properties of the other. A structural
side effect is one which consists only of strictly structural and strictly value
side-effects. A sort program which works by changing both car and cdr pointers in a
list is an example of a structural side effect. Again, if two objects are structurally
disjoint, a structural side effect to one will not change any property of the second.

Finally, an indirect side effect is one which only changes properties of values of a
recursive structure. For example, a marking graph traversal procedure which sets the
mark property of the value of each node is such an indirect side-effect procedure. If
two objects are value disjoint, then indirect side-effects to one will leave properties of
the other unchanged.



We say a structure is isolated if it shares structure with no other objects. We
may divide this into the types used above, referring to structural, value, and total
isolation. Most routines which build new structures, such as append and maplist, create
structurally isolated objects. This is important, since a structurally isolated object is
relatively safe to side effect; often programs will create a copy of an object and
side effect the copy as a means of guaranteeing that unwanted interactions do not
result.

Finally we come to a classification of properties. We can notice that some
properties such as sublist or length only depend on the recursive structure of the
object, and not on the values at each node. We call these strictly structural properties.
More commonly, properties such as memq depend both on the recursive structure and
on the identity of the various values present at each node, but not on any property or
sub-structure of these values; these are called value dependent properties. Finally,
there are properties which depend both on the structure and on mutable properties of
the objects present at each node. Member (as opposed to memq), for example, depends on
the structure of the list as well as on the structure of the objects pointed to by the
list; if the list (a b c) is a member of the list l-1 then the LISP invocation:


(Member '(A B C) L-1)


will return a non-nil answer, namely the list (a b c), which is a member of l-1. If this
list is then rplaca'd so that its first element is x, then (member '(a b c) l-1) will return
nil. Let us call such a property a value indirect property.

A side effect at one level cannot affect a property of a lower level. For
example, a value side effect cannot affect a strictly structural property. An indirect
side effect cannot affect a structural property.
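
One way to read this level rule is as a simple ordering test. The sketch below is an illustrative assumption, not REASON's representation: it numbers the levels so that strictly structural is lowest and indirect is highest, and uses the same numbering for side effects (strictly structural, strictly value, indirect) and for properties (strictly structural, value dependent, value indirect).

```python
from enum import IntEnum


class Level(IntEnum):
    STRUCTURAL = 1  # immediate-child pointers (changed by rplacd)
    VALUE = 2       # value pointers (changed by rplaca)
    INDIRECT = 3    # properties of the value objects themselves


def may_affect(side_effect: Level, prop: Level) -> bool:
    """A side effect cannot affect a property of a lower level than its own."""
    return prop >= side_effect
```

Under this reading `may_affect(Level.VALUE, Level.STRUCTURAL)` is false, while a strictly structural side effect may affect properties at every level, since every property depends at least on the structure.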

These observations can be summarized by several simple rules of the following
form:




(Rule ((:f1 (Side-Effect obj-1 s-in s-out :se))
       (:f2 (Structural-Side-Effect :se))
       (:f3 (Structurally-Disjoint obj-1 obj-2))
       (:f4 ((:fact obj-2) s-in)))
  (Assert (Safe-from :f1 :f4)
          (Structure-Disjoint :f1 :f2 :f3 :f4)))

(Rule ((:f1 (Side-Effect obj-1 s-in s-out :se))
       (:f2 (Structural-Side-Effect :se))
       (:f3 (Structurally-Isolated obj-1))
       (:f4 ((:fact obj-2) s-in)))
  (Assert (Safe-from :f1 :f4)
          (Structure-Isolation :f1 :f2 :f3 :f4)))

Notice that since these rules assert that a particular property is safe, they
remove the need to engage in the more complicated analysis shown in the last chapter.
A large percentage of side effects have a very limited range of effect precisely because
there is a strong limit on structure sharing. Most of the time one of the above rules
will fire and REASON's work will be done. In some rare cases, the more complex
and thorough analysis will be required.
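
The shape of these rules, a cheap check that either yields a justified safe-from conclusion or falls through to the thorough analysis, can be sketched as follows. This is a hedged Python illustration only: the fact tuples, the kind names, and the dictionary returned are assumptions, not REASON's rule language.

```python
# Side-effect kinds that touch only the cells of the structure itself.
STRUCTURAL_KINDS = {"strictly-structural", "strictly-value", "structural"}


def safe_from(kind, obj1, obj2, facts):
    """Return a justified conclusion that facts about obj2 survive a side
    effect of the given kind on obj1, or None if no quick rule applies."""
    candidates = []
    if kind in STRUCTURAL_KINDS:
        candidates = [
            (("structurally-disjoint", obj1, obj2), "structure-disjoint"),
            (("structurally-isolated", obj1), "structure-isolation"),
        ]
    elif kind == "indirect":
        candidates = [(("value-disjoint", obj1, obj2), "value-disjoint")]
    for fact, rule_name in candidates:
        if fact in facts:
            # Record which rule fired and which fact supported it, so the
            # conclusion can be withdrawn if that fact is later retracted.
            return {"conclusion": ("safe-from", obj1, obj2),
                    "justification": rule_name,
                    "support": [fact]}
    return None  # fall back to the more thorough analysis
```

A `None` result here corresponds to the rare case in which neither rule fires and the more complex analysis is needed.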

This approach requires a classification of side effects into the various levels and a
similar classification of properties. Relation-definitions provide the basis for these
classifications. To decide whether a property is structural one need only determine
whether it depends on the node property of the object, given that the object can be
viewed as a recursive structure. Similarly, if the property depends on any value
pointer of a node it is a value property. If it depends on properties of the value
objects it is an indirect property. For example, consider the member relation for lists:


(Member List Object) <=> (Or (First List Object)
                             (Member (Rest List) Object))


Since the property depends on first, which is the value pointer for lists, the property is
a value property. Also, since it involves a recursive definition involving the rest (which
is the immediate child pointer for lists), it is a structural property. The image relation
on an alist is an indirect property, as shown by its definition:




(Image Alist Key Value) <=>
  (there is (el) (Member Alist el)
   such-that (And (Key el Key) (Value el Value)))


Since this definition depends on member, which is both a structural and value property,
the image relation is also a structural and value property. However, in addition it
depends on the key and value parts of the members, which are values; therefore, it is an
indirect property. Side effects may be categorized using a similar analysis of defined relations.
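
The classification procedure described above amounts to collecting the kinds of pointers a definition mentions, following references to other defined relations. The Python sketch below is an assumption-laden illustration: the tables `POINTER_KIND` and `DEFINITIONS`, and the name `alist_relation` for the alist relation just discussed, are hypothetical and not part of REASON.

```python
# Which level each accessor belongs to (illustrative assumption).
POINTER_KIND = {
    "rest": "structural",   # immediate-child pointer of a list
    "first": "value",       # value pointer of a list
    "key": "indirect",      # part of a value object
    "value": "indirect",    # part of a value object
}

# What each defined relation mentions in its definition.
DEFINITIONS = {
    "member": {"first", "rest", "member"},
    "alist_relation": {"member", "key", "value"},
}


def classify(relation, seen=None):
    """Return the set of levels on which a defined relation depends."""
    seen = seen or set()
    if relation in seen:          # recursive reference, already accounted for
        return set()
    seen = seen | {relation}
    levels = set()
    for dep in DEFINITIONS[relation]:
        if dep in POINTER_KIND:
            levels.add(POINTER_KIND[dep])
        else:
            levels |= classify(dep, seen)
    return levels
```

Under these assumptions `classify("member")` yields the structural and value levels, and the alist relation additionally picks up the indirect level through its use of the key and value parts.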




Chapter  13:  Reasoning  About  Program  Modifications 

In the previous chapters I have shown how REASON analyzes a program,
maintaining an explicit representation of all logical dependencies. Although such an
explicit representation is costly in terms of space consumption, I will show in this
chapter how that cost is repaid during the process of program modification. It cannot
be overemphasized that this concern has been the driving force behind my design
decisions. The price of software maintenance is the most rapidly escalating part of
computer costs and the one with the least likelihood of decreasing. An expensive tool
which effects a ten percent reduction in software costs would repay its cost
many fold.

When REASON has analyzed a program it has a very rich knowledge structure
annotating the program text. This structure includes a complete record of the proof
of all prerequisite and achieve goals, purpose links summarizing the inter-relationship
between the spec clauses of the sub-segments and the main segment, links to
implementation methods, defined relations and other knowledge about the data-objects
of the program, and finally a recognition map connecting fragments of the program to
the standard plans of the library. Such standard plans, in turn, are organized into
specialization hierarchies in which, for example, list-enumeration is regarded as a
specialization of the enumeration plan for general recursive-structures.

When a program modification is proposed, REASON uses this rich knowledge
structure to discover how far the effect of the proposed modification will propagate.
Typically, there is some decomposition of the program in which the change has effect
only within a particular segment's boundaries, leaving unchanged most of the logical
structure outside. For example, if the method of representing a set is changed from
lists to arrays, then only the enumeration part of the program will be modified. By
looking at the temporal viewpoint of the program we can regard the enumeration as a
separate segment which produces a temporal-collection of the members of the set. This
is true in both the new and the old versions of the program. Thus, we know that the
rest of the program (the part outside the enumeration) is not affected.

The goal in analyzing a program modification is to be able to use the logical
analysis of the old program to help understand the new program. As the programmer
edits the old version, REASON attempts to follow the chains of dependencies to see
what requirements of the old structure are no longer met. If there are no such
broken chains then the modification is merely an addition of some new behavior
which can be analyzed by the mechanisms of the previous chapters. We will now
look at how REASON determines what is affected.

Most changes are relatively straightforward to analyze. For example, consider
what must be done if a new expect clause is added to a segment. First, the
dependencies linking the expect clauses to the segment's applicable assertion must be
rebuilt to include the new expect. Then a proof of the new expect clause must be
undertaken. If this succeeds, the segment will be declared applicable and no
interaction with the user is required. If not, the applicable assertion for the segment
will be out.

The Truth Maintenance System can be requested to signal every time a particular
fact changes status from in to out or vice versa. As REASON evaluates a plan
diagram it makes such a request for every expect clause of a sub-segment and every
assert clause of the main segment. Such requests are also made for the applicable
assertions for each sub-segment and for the main segment. Thus, when analyzing the
effect of adding a new expect, REASON is first signalled that the sub-segment is no
longer applicable. If the proof of the new expect clause succeeds, TMS signals that
the status of the applicable assertion is now in. Thus, REASON knows that
everything is all right. If not, REASON reports that the segment is no longer
applicable. However, all segments which depend on the modified segment will also
become inapplicable; REASON collects these signals as well. What to do with this
information is the province of discourse expertise not yet present in the apprentice
system.
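
The signalling behavior described here can be sketched with a toy truth maintenance system. The Python below is a minimal illustration under stated assumptions, not the TMS used by REASON: the class `TMS`, its method names, and the fact names are hypothetical, and belief is simply recomputed from the premises after every change.

```python
class TMS:
    """Toy truth maintenance: a fact is 'in' if it is a premise or if some
    justification for it has all of its antecedents 'in'."""

    def __init__(self):
        self.premises = set()
        self.justifications = {}  # fact -> list of antecedent lists
        self.watchers = {}        # fact -> callbacks fired on status change
        self.status = {}          # fact -> True when 'in'

    def watch(self, fact, callback):
        self.watchers.setdefault(fact, []).append(callback)

    def assume(self, fact):
        self.premises.add(fact)
        self._update()

    def justify(self, fact, antecedents):
        self.justifications.setdefault(fact, []).append(list(antecedents))
        self._update()

    def retract_justifications(self, fact):
        self.justifications[fact] = []
        self._update()

    def _update(self):
        # Recompute belief from the premises up to a fixed point.
        new = {fact: True for fact in self.premises}
        changed = True
        while changed:
            changed = False
            for fact, justs in self.justifications.items():
                if not new.get(fact) and any(
                        all(new.get(a) for a in j) for j in justs):
                    new[fact] = True
                    changed = True
        old, self.status = self.status, new
        # Signal every watched fact whose status flipped.
        for fact in set(old) | set(new):
            if old.get(fact, False) != new.get(fact, False):
                for callback in self.watchers.get(fact, []):
                    callback(fact, new.get(fact, False))


log = []
tms = TMS()
tms.assume("expect-1")
tms.justify("applicable", ["expect-1"])
tms.watch("applicable", lambda fact, is_in: log.append((fact, is_in)))
# A new expect clause is added: rebuild the justification of 'applicable'.
tms.retract_justifications("applicable")
tms.justify("applicable", ["expect-1", "expect-2"])
# The proof of the new expect clause succeeds.
tms.assume("expect-2")
```

In this run the watcher first hears that the applicable assertion went out and then, once the new expect clause is proved, that it came back in, which mirrors the sequence of signals described in the text.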

Removing an assert from a sub-segment presents a similar problem, although
there is a new opportunity. Since REASON has already built purpose links, it is a
simple matter for it to consult these before doing anything else. The purpose links
tell REASON that the assert provides support for various sub-segment expect clauses
and main segment assert clauses. Each of these is examined to see if it has other
support which is independent of the clause being removed. If each such clause has
independent support then the change has no effect; REASON tells the user that
everything is in order. Otherwise, it issues a warning, saying which dependent
segments are affected. If more information is desired, REASON follows through the
actual justifications to build a trace of the broken proof.
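
The independent-support check can be sketched as a small fixed-point computation over the recorded justifications. This Python fragment is an illustration only; the dictionary format (each derived clause mapped to a list of alternative sets of supporters) and the clause names are assumptions.

```python
def affected_by_removal(removed, justifications):
    """Return the clauses left with no independent support once `removed`
    is deleted. `justifications` maps each derived clause to a list of
    alternative supporter sets; a premise would map to [set()]."""
    lost = {removed}
    changed = True
    while changed:
        changed = False
        for clause, supports in justifications.items():
            # A clause is lost when every one of its justifications
            # mentions some clause that is already lost.
            if clause not in lost and all(s & lost for s in supports):
                lost.add(clause)
                changed = True
    return lost - {removed}


justifications = {
    "expect-B": [{"assert-A"}, {"assert-C"}],  # has independent support
    "expect-D": [{"assert-A"}],
    "assert-main": [{"expect-D"}],
}
broken = affected_by_removal("assert-A", justifications)
```

Here `expect-B` survives because of its second justification, while `expect-D` and the main-segment assert that rests on it would be reported to the user.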




In the case where one clause is deleted and another is asserted, REASON waits
until both actions have been performed before checking to see which applicable
assertions have changed status. Otherwise it handles matters as above.

When a data-flow link is changed, the assignment of objects to the segment's
input ports must be updated; those expect clauses which mentioned the affected
port must be recreated with the correct objects substituted in. The justifications
linking expect clauses to the segment's applicable assertion must then be rebuilt.
REASON then proceeds as in the case where an expect clause is changed. Notice
that in all these cases when a part of the plan diagram is removed, the TMS outs all
goals and conclusions which followed from the outed statement.

When a relation of an object type is redefined, similar effects take place. The
rules corresponding to the relation definition not only make a deduction, but as with
everything else in REASON, they provide a justification for the deduction. In the
case of defined relations the justification points to the assertion in which the
relation definition is stated. Thus if the relation is changed, the old definition goes
out and facts following from the definition also lose support.

More commonly, however, the programmer will not change a relation-definition,
but will rather create a new object-type in which a different definition appears. If an
object of the old object-type was deduced to have a particular property, then the
justification for this will point to the most specific object-type in which the relation is
defined. For example, the membership relation for alists is defined at the level of
lists; the justification, therefore, points to the assertion stating that the object's type
is list and not to the assertion stating that it is an alist. In contrast, the image relation
(x is the image of y in the alist l-1 if there is a pair whose left is x and whose right is
y and that pair is a member of l-1) is defined at the level of alist; any deduction
arising from the image relation would depend on the assertion stating that the object is
an alist. Thus, if the programmer changes the type of an object, all deductions
which depend on its having an object-type which it no longer holds will lose their
support and go out. However, rules corresponding to the definitions of new relations
corresponding to the new type might trigger, bringing in new facts. Finally, if the
object is changed from one sub-type to another, it is possible that no important
relation is defined at the more specific level and, therefore, nothing significant will
happen.



In general, then, the pattern of reasoning at this level is quite clear. A plan
editing program is instructed by the user to add, delete, or change some feature of a
plan diagram or some part of an object description. This action may cause some facts
to change status from in to out or vice versa. If an applicable assertion for either a
sub-segment or the main segment winds up being out after all effects have been
propagated by the TMS, then REASON warns the programmer that an error has been
introduced.




Section  13.1:  Updating  The  Recognition  Map 

When the apprentice first analyzes a program, it builds a recognition map
explaining how the program uses standard library plan fragments to achieve its goals.
A program modification will typically force the system to rebuild this map to reflect
the new situation. Fortunately, simple modifications to program structure do not
affect the recognition map in a drastic manner. For example, consider the following
program:

(defun Accumulate-Salary (List-of-Employees)
  (do ((L List-of-Employees (cdr L))
       (Sum 0))
      ((Null L) Sum)
    (Setq Sum (+ Sum (Caar L)))))

This is a simple, unfiltered summation program. The list is a list of records, where
each record has the salary in the first position of the record. This is diagrammed as
follows:


Unfiltered  Summation  Program 




Let us consider various simple modifications which might be made to this program.
Suppose, for example, that we put in a test so that we summed only the bi-weekly
employees:


(defun Accumulate-Salary (List-of-Employees)
  (do ((L List-of-Employees (cdr L))
       (Sum 0))
      ((Null L) Sum)
    (cond ((eq (cadar L) 'Bi-Weekly)
           (Setq Sum (+ Sum (Caar L)))))))

where the second field of each record is the employee-type. Of course, the effect of
this change is to filter the inputs to the summation segment. The recognition proposer
would suggest that the recognition of the summation segment is still correct and that the
recognition of the list-enumeration is still correct. (The recognition proposer [Rich, 1977]
is outside the scope of this thesis.) However, the temporal-collection input to the
summation segment is now changed; the plan recognizer suggests that the new segment is
a filter segment interposed between the enumeration and the summation routine:




Filtered  Summation  Plan 

Thus, the recognition of two parts of the plan remains the same; only the filter
section and the temporal-collection data-flows change at all. We can easily change
our understanding of the overall effect of the plan to reflect the addition of the filter
simply by re-evaluating what collection of objects flows into the summation segment in
the new plan.
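
The temporal decomposition of the two summation programs can be mimicked with generators. The Python sketch below is only an analogy to the plan diagrams, with all function names chosen for illustration; records are assumed to be (salary, employee-type) pairs, matching the field order described above.

```python
def enumerate_list(records):
    """Enumeration segment: produce the records one at a time."""
    for record in records:
        yield record


def filter_bi_weekly(records):
    """Filter segment: pass along only the bi-weekly employees."""
    return (r for r in records if r[1] == "bi-weekly")


def summation(records):
    """Summation segment: add up the salary field of whatever flows in."""
    total = 0
    for record in records:
        total += record[0]
    return total


def accumulate_salary(employees):
    return summation(enumerate_list(employees))


def accumulate_bi_weekly_salary(employees):
    # Only the filter is new; enumeration and summation are reused as is.
    return summation(filter_bi_weekly(enumerate_list(employees)))
```

The second version differs from the first only by the interposed filter, which is the sense in which the recognition of the enumeration and summation segments remains valid.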

The basis for this separation is Waters' [Waters, 77] observation that recursive
programs (including loops) can be broken up into a temporal decomposition by
inspection of the pattern of data- and control-flow links. Thus, when a recursive
program is modified, the system checks to see whether the clues used previously in
forming the segmentation are still in. If so, it only tries to form segments for the
new code. Although my system and Waters' are not yet interfaced into a unified
apprentice system, the discipline of explicit recording of all important control
information can serve to make the interface a matter of less complexity than would
otherwise result.

Let  us  consider  another  simple  example  showing  how  these  techniques  can  be 
applied.  Suppose  that  we  had  a  program  in  which  the  data  was  stored  as  a  list  and 
that  for  some  reason  the  program  was  modified  to  store  the  data  in  a  binary-tree  with 
a  value  at  each  node.  The  two  programs  are: 

(defun list-version (list)
  (do ((l list (cdr l))
       (sum 0))
      ((null l) sum)
    (cond ((eq (type-field (car l)) 'bi-weekly)
           (setq sum (plus sum (salary-field (car l))))))))

(defun tree-version (tree)
  (tree-version-1 tree 0))

(defun tree-version-1 (tree sum)
  (cond ((eq (type-field tree) 'bi-weekly)
         (setq sum (plus sum (salary-field tree)))))
  (cond ((non-terminal tree)
         (setq sum
               (tree-version-1 (left tree)
                               (tree-version-1 (right tree) sum)))))
  sum)

The second program can be analyzed as a composition of a tree-traversal, a filter, and a
standard summation plan. Similarly, the first program can be separated into a
list enumeration and the same filter and standard summation plan. In making this coding
change many surface details change; however, much of the deep structure of the
program remains constant. As the apprentice analyzes the modification, it will have
to rebuild the recognition mapping since some of the details of the recognition have
changed. However, those details which don't change are represented as facts in the
data base which stay in throughout the whole process. Thus, those deductions which
are based on facts of the analysis which are not changed between the two versions
stay in themselves and do not require any further deductive effort.

Let us look again at an example shown at the beginning of this thesis in which a
hash-table is changed from a linked list representation for the buckets to a rehashing
scheme in which the cells of the array form the data structure to be searched. We
can easily see that both the list and the set-of-cells form an acyclic recursive structure
of node-degree 1. We call such structures linear structures and observe that arrays and
single paths through a tree are also linear structures. The plan library contains a
hierarchy of plans for traversal of recursive-structures, the most general of which is:


For  linear-structures  there  is  another  more  specialized  plan  which  replaces  the  plan  for 
enumerating  the  immediate-children  by  the  more  simple  plan  immediate  child  which 
fetches  the  unique  immediate  child  of  the  current  node.  Thus,  we  get: 


[Figure: plan diagram with boxes labelled Terminal, Linear-Structure-Traversal,
Immediate-Child, and Join-Temporal-Collections]

Enumerator  Plan  for  Linear  Structures 

Notice that this plan is merely a generator of successive sub-structures. In fact, it
includes in its output the terminal element of the linear-structure, which in most cases
(such as list traversal) is not useful. More specialized versions of the
linear enumeration plan filter out the terminal. We are here concerned with a
particular type of linear-structure, those with a value set of size 1 (i.e. we want there
to be a unique first value of each sub-structure). Most often we want to augment the
specialized linear-enumerator with an operation to fetch the first value of the
sub-structure.




Enumerator  of  Values  of  A  Linear  Structure 


This plan can be specialized in many ways depending on the nature of the
linear-structure. However, the specialization is always concerned with the
representation of the linear-structure, i.e. with the particular means of fetching the
immediate child and the particular means of fetching the first item of the
sub-structure enumerated. In the two cases we are considering these details are quite
different. In the case of LISP lists, the immediate-child operator is cdr and the first
operator is car.

However, in the case of the rehashing scheme, things are a bit more complicated.
The sub-structures being enumerated are represented by a triple consisting of an
array, an initial-index and a current-index. The next sub-structure is the object
implemented by the triple consisting of the same array and initial-index but with a
new current-index which is the rehash of the old current-index. Special provision is
made for representing the terminal sub-structure. For example, the rehash of the last
index in the sequence can be a negative number; a triple with a negative current-index
is defined to implement a terminal. The first item of any sub-structure is the item of
the array indexed by the current-index.

Thus, the above diagram can be further specialized for these two designs,
replacing the more abstract first and immediate-child by car and cdr in one case and by
array-fetch and rehash in the other. Similarly, the terminal-test in the two programs
must be different since terminals in the two designs are represented differently.
Notice that in the plan which uses rehash we use triples of objects to represent the
sub-structures flowing between the subsegments. These triples are seen in the surface
plan as separate dataflows involving the array, the initial-index, and the
current-index. Higher level recognition procedures must group these three flows into a
single virtual flow.

This analysis lets us see where the modifications need to be made to effect the
change in representation. For one thing, since the output of either of these more
specialized enumerators is the same temporal-collection of values, we know that the
consumer parts of the plan do not need to change. Thus, the lookup program in either
version would involve one or the other specialized version of the above plan coupled to
the following standard search plan fragment.


The  Standard  Search  Plan 




The key observation, however, is that the search plan can be seen to have no
dependencies linking it to the design choice of the data-structure traversed by the
ENUMERATOR. Indeed, only the TERMINAL-TEST, FIRST and UNIQUE-CHILD sub-segments of the
ENUMERATION plan depend on the design choice. When the programmer proposes to
change the representation of the bucket from lists to a different type of
linear structure, the apprentice can immediately determine the extent of the effects.
The entire search part of the plan is safe and the only parts of the enumeration which are
affected are the FIRST and REST parts, which in the current version are the CAR and CDR
segments, and the terminal-test, which in the current version is the test NULL. Based on
these observations the apprentice can describe in high-level terms what segments need
changing.
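
The way recorded dependencies bound the effects of a change can be sketched in a few lines. This is a toy illustration, not REASON's data structures: the segment names, the decision labels and the dictionary `DEPENDS_ON` are all hypothetical stand-ins for the purpose links the system records.

```python
from typing import Dict, List, Set

# Hypothetical dependency records: each segment of the plan lists the design
# decisions that its correctness argument rests on.
DEPENDS_ON: Dict[str, Set[str]] = {
    "terminal-test": {"bucket-is-a-list"},
    "first": {"bucket-is-a-list"},
    "rest": {"bucket-is-a-list"},
    "search-test": {"key-equality"},
    "search-exit": set(),
}


def affected_segments(depends_on: Dict[str, Set[str]], retracted: str) -> List[str]:
    """Return the segments whose support mentions the retracted decision."""
    return sorted(seg for seg, deps in depends_on.items() if retracted in deps)


def safe_segments(depends_on: Dict[str, Set[str]], retracted: str) -> List[str]:
    """Return the segments that do not depend on the retracted decision."""
    return sorted(seg for seg, deps in depends_on.items() if retracted not in deps)
```

Retracting the decision that the bucket is a list flags only the three enumeration segments; the search segments are reported as safe without being re-examined.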

In a more advanced version of the apprentice I expect there to be at least
enough synthesis expertise to choose the correct specialization of LINEAR-ENUMERATION for
traversing  the  rehashed  cm s  data-structure.  The  apprentice  could  use  this  newly 
selected plan to inform the programmer how to change his program to conform to the
new design. In any event, the ability to decompose the program into both temporal
and surface viewpoints allows the apprentice to treat the above modifications as
incremental just as would any reasonably skilled programmer.

Chapter  14:  Conclusions 
Section 14.1: Good Decisions

REASON's design deviates from that of standard verification systems such as
[Igarashi, et al., 1973] in many ways. REASON is intended to be a part of an
interactive programmer's apprentice system. It must function in a number of different
contexts including interactive design, plan modification, and verification and it must
service the needs of different communities of programmers using a variety of
languages. These requirements led to several novel design decisions.

In building the apprentice, it seemed essential that the system not be primarily
concerned with the actual program text. The primitives of a programming language
are too low level to worry about during the early (and probably later) stages of
design. Rich and I concluded that the system's formalism should be quite simple,
consisting of program segments connected by control and data flow; these seemed to
be the abstract notions which programming language primitives are intended to
achieve. Also we thought that this formalism would be a convenient one in which to
capture the teleological notions which constitute a plan.

The plan diagram formalism has, so far, done what it was intended to do.
However, there are some concepts which it can't handle. The formalism has no place
for a procedure which, like an interpreter, manipulates the representation of another
procedure, converting list structure into new data and control flow links. The notions
of data and control pathways take us almost to this goal, but more work needs to be
done. Similarly, we currently have no way of describing interrupts or synchronization
primitives. However, these were beyond the scope of our original goals; indeed, the
kinds of programs which we wished to attack are simply and conveniently described
by plan diagrams.

A second novel design choice was to use a symbolic interpreter rather than a
verification condition generator. The apprentice is intended to help a programmer
design systems; this is a somewhat chaotic process in which a programmer might
change his design frequently, moving segments, changing data flows, and adding in
new segments to perform tasks which had been overlooked. To do this effectively,
the programmer will need to have "snapshots" of the computation, so that he can ask
whether a property holds at a particular point of the computation. REASON's use of
situations satisfies this need while adding yet another advantage. The deductive
apparatus in REASON is completely integrated with the symbolic interpreter, allowing
simplification and partial deductions to be made as the interpretation of the plan
proceeds.
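
The idea of keeping one snapshot per point of the computation can be illustrated with a deliberately small sketch. It is not REASON's situation mechanism: facts are reduced to plain strings, each segment is modelled as a function from facts to facts, and the names `interpret`, `holds` and the two-segment `plan` are assumptions made for the example.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

Facts = FrozenSet[str]
Segment = Tuple[str, Callable[[Facts], Facts]]


def interpret(plan: List[Segment], initial: Facts) -> Dict[str, Facts]:
    """Run the segments in order, recording the facts known in each situation."""
    situations: Dict[str, Facts] = {"start": initial}
    facts = initial
    for name, effect in plan:
        facts = effect(facts)
        situations["after-" + name] = facts
    return situations


def holds(situations: Dict[str, Facts], situation: str, fact: str) -> bool:
    """Ask whether a property holds at a particular point of the computation."""
    return fact in situations[situation]


# A toy two-segment plan: sorting establishes "sorted"; pushing an item on the
# front makes the list non-empty but destroys sortedness.
plan: List[Segment] = [
    ("sort", lambda f: f | {"sorted"}),
    ("push-front", lambda f: (f - {"sorted"}) | {"non-empty"}),
]
snapshots = interpret(plan, frozenset())
```

Because every situation is retained, a question about an intermediate point (for example, whether the list is sorted just after the `sort` segment) can be answered without re-running the plan.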

A third unusual feature of my approach is that many low level language features
simply do not appear in REASON's plan diagram formalism. Assignment to variables
is regarded as a means of implementing a data flow; WHILE loops, etc. are all captured
by recursion. I have grown to find this means of expressing a program quite natural
and simple and think that a front end for the system could be engineered to make
plan diagrams a very natural vehicle for communication between the apprentice and
the programmer.

One future development would be to use a graphics system to allow the
programmer and the apprentice to communicate pictorially; the system would generate
the assertions of a plan diagram internally. The system would be able to display
standard library plans, modifying and specifying them on command from the
programmer. Systems could be designed by cutting and pasting pictorial plan diagrams
on a TV screen. This would require the programmer to learn the vocabulary of the
plan library, but I think this would be advantageous. The plan library gives names to
the standard patterns of programming; if programmers began to think in terms of such
notions, their task would be conceptually simpler. Search, accumulation, tree traversal,
etc. are all more powerful conceptual terms than are while, do, etc.

Another unique feature of the programmer's apprentice project has been our
emphasis on plan recognition and the development of a plan library. Much of this
work is being done in a separate thesis by Rich [Rich, 1977, 78], yet its influence on
REASON has been considerable. My concern with the temporal viewpoint and the
reasoning needed to support it is motivated by the needs of the plan library. This
development has helped us develop a natural and powerful vocabulary for describing
programs. Even if the rest of the project never reached fruition, the vocabulary itself
is of great value and might form the basis for introductory programming classes. A
student who learned to think in terms of standard plans would probably have a much
easier time mastering the basic skills of program design.




Turning now to my work on the reasoning system, there are several decisions
which I feel were positive. The use of Doyle's TMS [Doyle, 1978] as an integrating
mechanism is well justified. The decision to place a great deal of emphasis on defined
relations, allowing the user to state these declaratively, is another idea which seems
worth the extra effort of having the system translate these declarations into various
kinds of procedures. The work I have begun on reducing the complexity of side
effect analysis seems a quite promising outgrowth of the general approach of
integrating the reasoning system with a knowledge base and an epistemology of
programming concepts.

The main advance I feel that has been made in the current version of REASON
is the task agenda protocol suggested by Doyle [Doyle, 1978b]. Although I have not
yet fully developed the choice making protocols of the system, I believe that this
offers the only route for building a system with evolving capabilities. Hopefully,
future research will provide some insights into how to use this power to advantage.
Finally, I think the initial work on modification reported here is moving in the correct
direction. The use of temporal viewpoint plans to guide the system during
modification sessions is a promising idea.

Although the current version of the system is still being implemented, I did
succeed in getting the first version to do some fairly involved proofs. In the scenario
of this thesis the programmer designs an associative retrieval system along the lines of
Conniver's data base. The first version of REASON successfully completed proofs of
all the routines used in this data base, including the fast intersect routine, the indexer,
and the pattern matcher. In addition, it recorded the dependencies produced during
these proofs and summarized them into purpose links. It was not overly fast,
however, and it needed most of the 256K available on our PDP-10 to complete the
longest of these proofs. Although this earlier system had some success, it was not the
system I wanted. The newer system combined with the MIT LISP machine hardware
will probably be a far more useful machine.




Section  14.2:  Problems 

I had hoped to implement more of the new system by now; yet each
implementation stands on the shoulders of its predecessor. My experiences in
implementing the earlier version of the system are worth mentioning. Originally I was
primarily concerned with the efficiency of the system and, therefore, rejected the
already existing general purpose problem solving language in favor of building
mechanisms carefully tailored to the special needs of program analysis. As a result I
spent a considerable amount of intellectual effort on issues at too low a level. As
time passed, it became increasingly obvious that I was re-inventing the wheel.

One particularly painful aspect of this problem was the use of an unduly
complicated context system (see [Rich & Shrobe, 1976]). REASON builds a situation
tree representing the temporal behavior of the program; since it had to engage in
hypothetical reasoning as well, I implemented an extension of the context mechanism
of Conniver [McDermott, 1972] which allowed two dimensions of context. This was
unfortunate. First, the mechanism was awkward and caused quite a few obscure bugs.
Second, and far more significantly, the context mechanism is inappropriate for my
purposes. However, since I was taking an incremental approach to the
implementation, I didn't realize this until late in the first implementation, at which
point I was forced to stick with what I had.

There are two circumstances in which the context mechanism seems unusable.
The first of these is plan modification; when a programmer modifies a plan by adding
or deleting a segment or by changing a data or control flow link, the succession of
situations is changed. In such a circumstance, the context layers must be reorganized,
often in ways which are precluded by artifacts of the context mechanism. Similarly,
transition analysis is made cumbersome by the context mechanism, which automatically
moves facts forward. Some other process must intervene, erasing the fact in the
context layer at which it ceases to be true. One can only begin to appreciate the
hairy timing problems this can cause; to fully appreciate it, you should code up such
a system and attempt to develop it. Finally, there was a representational problem; in
order to give a justification for a fact in a particular context it was necessary to have
a name for the context. To understand this name, the system had to have a map of
which names come after which others. The justifications in the old system, therefore,
were based on the situation tag representation while the reasoning system used
contexts. The use of situation tags, TMS, and explicit control assertions is a far
easier discipline to live with.




The current system, however, has some irksome problems also. One of these is
that it appears to be even more space consuming than the first implementation.
Within the next few years the hardware revolution will make this concern irrelevant;
in the meantime, however, experimentation is difficult. A more serious worry is that
the current system's mechanisms for side effect analysis, although correct, are not as
natural as I would like. The truth maintenance system ought to be able to use its
justifications to determine which facts should move across a transition. McAllester
and Doyle (private communication) have both suggested ideas for this kind of a
process, but these have not been incorporated into the current design.

There are still important forms of reasoning which are outside of REASON's
scope. Primary among these is reasoning about termination and the closely related
concern of time complexity. Techniques described later in the literature review such
as "ghost variables" might be quite easily integrated into the current system, but I
have not yet examined this idea thoroughly. Reasoning about space consumption is
another issue which I have not yet addressed at all. Finally, many of the powerful
heuristic techniques for inductive proofs used in some other systems
[Boyer & Moore, 1977] have not yet been integrated into REASON.




Section 14.3: Future Directions

I see this work growing in two directions at once. As I indicated in the
introduction, my work can be viewed as a technical stepping stone for future work on
self conscious systems such as that proposed in [Doyle, 1978b]. A next step in that
direction is to build an interpreter for plan diagrams. This is, in principle, quite
simple to do; it can follow the general pattern of the symbolic interpreter of this
thesis. An interesting exploration would be to implement the symbolic interpreter as a
plan diagram for the actual interpreter. Somewhat short of such fanciful exploration
is to begin to develop proof strategies as plan diagrams which are executed by the
interpreter. This will allow proof steps to be parts of more macroscopic actions within
which they play well defined roles. These roles can then be categorized and used as
the basis of both general and domain specific strategies. [Doyle, 1978b] discusses these
ideas more fully.

REASON seems to have quite a bit of room for development within the
apprentice system as well. First of all, there are numerous tasks described in this
thesis, such as modification and recognition, which are not yet integrated into the
system. More interesting, however, are some avenues of exploration which we have
not yet developed. One of these is the use of the reasoning system in more
unconstrained recognition scenarios than I have presented here.

I would like the apprentice to analyze 4 or 5 pages of related LISP functions
with almost no human intervention. Such a task would involve sophisticated problem
solving strategies drawing on the powers of the reasoning system. In particular, it
seems that this kind of recognition involves a certain amount of design expertise. One
organization of such a system would have a heuristic recognition component use lower
level clues to guess what high level design underlies the code. This design would then
be elaborated by a program synthesis module (now being worked on by Rich
[Rich, 1978]) working cooperatively with the reasoning system.

The next level task I would like to work on is the development of new expertise
within the apprentice system. Currently, the system relies on a body of programming
knowledge. As we now envision program synthesis, the apprentice can build a
program when it knows plans appropriate for the synthesis task. This can go a long
way if the knowledge base is extensive and sophisticated. However, introspection
suggests that there is more to programming than just pasting together what one
already knows. A direction of research which seems quite promising is to use the
reasoning system to develop new plans through logical analysis, analogical reasoning,
etc. Many of these ideas are being pursued by other researchers in other contexts; I
will indicate these in the next section which reviews the related literature.




Chapter  15:  A  Survey  of  Related  Work 

[Waldinger & Levitt, 1973] point out that the first program verification
methodology was worked out by Von Neumann [Von Neumann, 1963] and that
validation is as old as software itself. In 1966, McCarthy and Painter
[McCarthy & Painter, 1966] presented a proof of correctness for a simple expression
compiler. Foundational techniques using the notion of a state vector (the vector of
current values of all program variables) were presented in [McCarthy, 1962a,b, 63].
Indeed, the formal definition of Algol [McCarthy, 1964] was influenced by a concern
for provability.

However, the modern interest in verification seems to date back to Floyd's
pioneering work [Floyd, 1967] (independently, [Naur, 1966] developed similar ideas).
In this method, flowcharts are annotated by assertions which are believed to hold any
time control passes through the annotated point of the program. An informal
verification may be constructed by dividing the flow chart into control paths, showing
that each assertion is a logical consequence of the earlier assertions and the intervening
program steps on its path. The pairs of entrance and exit assertions state the I/O
properties of the program. Normally, the programmer need only supply these
assertions and one assertion, called the invariant, for each loop. This notion of
correctness is called partial correctness since it does not include a proof that the
program terminates. The assertions are called inductive assertions.
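
As a small illustration of where the three kinds of assertion sit, consider the following Python sketch. It is an example chosen for this purpose, not one taken from the literature cited, and the runtime `assert` statements only check the assertions on the executions actually performed; they mark the annotated points but do not constitute a proof.

```python
from typing import List


def sum_list(xs: List[int]) -> int:
    # Entrance assertion: xs is a list of integers (nothing else is assumed).
    total, i = 0, 0
    while True:
        # Loop invariant (the inductive assertion attached to the loop head).
        assert 0 <= i <= len(xs) and total == sum(xs[:i])
        if i == len(xs):
            break
        total += xs[i]
        i += 1
    # Exit assertion: follows from the invariant together with i == len(xs).
    assert total == sum(xs)
    return total
```

The informal verification has three paths to check: entry to loop head, loop head around the body back to loop head, and loop head to exit. Nothing here shows that the loop terminates, which is why the result is only partial correctness.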

Floyd also introduced a method for proving termination of programs. This proof
is conducted separately from the proof of partial correctness and involves constructing
a mapping between program variables and a well-founded set, i.e. a partially ordered
set with no infinite descending chains. If a monotonically decreasing mapping can be
constructed then the program must terminate. [Manna, 1969] formalized these results,
showing that the partial correctness of the program is equivalent to the satisfiability of
a statement in first order logic and that total correctness is equivalent to the
unsatisfiability of a second statement. Intuitively, the first statement says that there is
a set of inductive assertions from which a partial correctness proof can be built; the
second statement says that there is no set of assertions which would imply that the
program halts with incorrect values.
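
A minimal sketch of the well-founded-set argument, using Euclid's algorithm as an example chosen here (it is not drawn from the works cited): the state is mapped to the natural number `b`, and the assertion checks at run time that this measure strictly decreases while remaining non-negative.

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm for a >= 0, b >= 0, with a checked termination measure."""
    assert a >= 0 and b >= 0
    while b != 0:
        measure_before = b            # maps the state into the natural numbers
        a, b = b, a % b
        # The measure strictly decreases and stays in the well-founded set.
        assert 0 <= b < measure_before
    return a
```

Since the natural numbers admit no infinite descending chain, a measure that decreases on every iteration forces the loop to stop. As with any runtime check, the `assert` only exercises the argument; the proof obligation is that the inequality holds for every reachable state.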



C.A.R. Hoare [Hoare, 1969] extended Floyd's work by showing how it could be
fit into a formal logical language. Hoare introduced the notation P {A} Q to mean
that if P is true before program A is executed, then Q will be true after A's
execution (if A terminates). Hoare also presented several rules of inference for this
system such as:

$$\frac{P \Rightarrow R,\quad R\ \{A\}\ S,\quad S \Rightarrow Q}{P\ \{A\}\ Q}$$
In later work, Hoare [Hoare & Wirth, 1973] presented an axiomatization of the
programming language PASCAL using this formalism. The primitives of the language
are defined by partial correctness formulae which, like those above, state how a
language construct will transform a predicate. For example, assignment to a simple
variable is defined by

$$P[E/x]\ \{x := E\}\ P$$

This states that if P holds after x is assigned E, then P with every occurrence of x
replaced by an occurrence of E is true before the assignment. Hoare's techniques
were taken further in [Dijkstra, 1975, 1976] where Hoare's partial correctness formulae
are extended to Dijkstra's predicate transformers. Where Hoare would write P {A} R,
Dijkstra would write P = wp(A,R), indicating that P is the weakest predicate which
guarantees both that A terminates and that R will hold afterwards. Thus, Dijkstra's
predicate transformers strengthen Hoare's work to deal with total correctness.
Dijkstra also used his notions to define a language with limited non-determinism.
Another extension called the intermittent assertion method [Manna & Waldinger, 1976]
also allows proofs of total correctness; it uses assertions which are not invariants but
which must hold at least once (hence the name intermittent assertion). [Pratt, 1976]
presents foundational work providing a semantic model for these formalisms and a
logic in which the methods of Floyd, Hoare, Dijkstra, etc. can be compared. A host
of literature analyzing the theoretical and computational foundations of these methods
has appeared in recent years such as [Lipton, 1977] and [Jones & Muchnick, 1977] to
choose two at random.
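
The backward reading of the assignment axiom and the idea of a predicate transformer can be sketched semantically. In the Python below, predicates are functions from states to booleans rather than formulae, so substitution of E for x becomes evaluation in an updated state. The names `wp_assign` and `wp_sequence` and the example program are assumptions made for illustration; since straight-line assignments always terminate, the distinction between partial and total correctness does not arise in this fragment.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
Predicate = Callable[[State], bool]
Expr = Callable[[State], int]
Assignment = Tuple[str, Expr]          # (variable, expression) meaning x := E


def wp_assign(var: str, expr: Expr, post: Predicate) -> Predicate:
    # Assignment axiom read backwards: the precondition is the postcondition
    # evaluated in the state where var has been replaced by the value of expr.
    return lambda s: post({**s, var: expr(s)})


def wp_sequence(program: List[Assignment], post: Predicate) -> Predicate:
    # Work from the last statement to the first.
    pre = post
    for var, expr in reversed(program):
        pre = wp_assign(var, expr, pre)
    return pre


# Program: x := x + 1; y := x * 2      Postcondition: y > 10
program: List[Assignment] = [
    ("x", lambda s: s["x"] + 1),
    ("y", lambda s: s["x"] * 2),
]
pre = wp_sequence(program, lambda s: s["y"] > 10)
```

For this program the computed precondition is equivalent to x > 4: starting with x = 5 the final y is 12, while starting with x = 4 it is exactly 10 and the postcondition fails.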




Floyd's method was developed for flow chart programs and uses inductive
techniques implicitly. [Manna & Pnueli, 1970] generalized Floyd's techniques to handle
recursive programs as well. Inductive arguments are relied on. Other inductive
techniques were also developed, including computational induction [Park, 1969],
recursion induction [McCarthy, 1963] and structural induction [Burstall, 1969]; sub-goal
induction, a variant of the inductive assertion method, is presented in [Morris &
Wegbreit, 1977]. Structural induction "inducts" on the depth of recursion of the
program's data structures; computational and recursion induction are inductions on the
depth of function calling. Floyd's inductive assertion method is an induction on the
length of the computation path. These methods are surveyed in [Manna, Ness &
Vuillemin, 1972] and also in [Reynolds & Yeh, 1976]. [Manna, 1974] covers a wide
range of theoretical issues underlying program verification and related fields.

Following Floyd and Hoare's seminal papers a literature began to develop in
which various hand proofs of program correctness were presented. These include
[Hoare, 1971] and [London, 1970a,b,c, 1971]. Attempts to automate the process soon
followed. The first of these was [King, 1969], a very fast verifier of limited power.
King's system was coded in assembly language and had built in several special purpose
features for simplifying expressions and for handling systems of linear inequalities.

The second system within the Floyd-Hoare framework was PIVOT, implemented
by Peter Deutsch [Deutsch, 1973]. PIVOT verified programs written in a limited
Algol-like language; it works in a manner more similar to my symbolic interpreter
than do many of the later systems. PIVOT traversed the program text in forward
order (i.e. it started at the beginning and moved towards the end) and interleaved
simplification of expressions with interpretation of the program text. It also used a
context mechanism to record the values of variables and the truth value of clauses.
The context mechanism allowed PIVOT to have an incremental view of the
computation's temporal progression. PIVOT had a fixed sequence of deductive
techniques which it employed repeatedly. It worked by refutation, trying to reduce
the negation of the goal to a contradiction. It was, thus, more like a resolution
theorem prover than many of the other verification systems since.

Three further systems followed, inspired largely by the original implementation of
the Stanford Verifier [Igarashi, et al., 1973]. Igarashi, London and Luckham reduced
Hoare's logical system to a core of rules which were deduction complete (i.e. anything
the full set could deduce, the core could as well) and which, furthermore, could be
used deterministically. The Stanford group's set of rules was chosen so that there
would always be exactly one rule which could be applied at a time. These rules were
used in a backwards manner to create a series of subgoals for the output assertion.
For example, if the program were:

$$P\ \{A;\ x := E\}\ Q$$

there  would  be  exactly  one  rule  whose  consequent  matches  the  expression  within  the 
braces.  For  example: 


$$\frac{P\ \{A\}\ Q[E/x]}{P\ \{A;\ x := E\}\ Q}$$

These rules are applied repeatedly until the inside of the braces contains an empty
program. An implication made from the two formulae surrounding the empty braces
is handed to a theorem prover; if the implication can be proven, the program is
correct. Originally, the Stanford system used a resolution theorem prover [Allen &
Luckham, 1970], but this was replaced by an algebraic and logical reduction system
implemented by Suzuki [Suzuki, 1975]. Further work on the Stanford verifier includes
a technique for proving termination [Luckham & Suzuki, 1975] which inserts in each
loop a "ghost" variable to count the number of repetitions. Termination is proved by
demonstrating an upper bound for the ghost variable. [Luckham & Suzuki, 1976]
extended the proof rules to include more complex data structures including records,
arrays and pointers. [Nelson & Oppen, 1978] have added a more powerful and
efficient simplifier to the Stanford system.
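
The ghost-variable idea can be shown on a tiny loop. The example below is an assumption chosen for illustration and is unrelated to the Stanford verifier's actual input language: `ghost` counts repetitions and influences no result, and `bound` is the claimed upper limit whose validity would have to be proved for all inputs rather than merely checked at run time.

```python
def halvings(n: int) -> int:
    """Count how many times n >= 1 can be halved before reaching 1."""
    assert n >= 1
    bound = n.bit_length()   # claimed upper bound on loop repetitions
    ghost = 0                # ghost variable: counts iterations, affects no result
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
        ghost += 1
        assert ghost <= bound  # termination evidence: the counter stays bounded
    return steps
```

If the assertion `ghost <= bound` can be established as an invariant, the loop cannot run more than `bound` times, so termination follows from an ordinary partial correctness argument about the counter.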

A second verification system was started at Stanford Research Institute [Elspas,
Levitt, & Waldinger, 1973] which used a verification condition generator similar to
that of Igarashi, et al. However, the SRI system was built around a natural
deduction theorem prover of some power, implemented in the language QA4. QA4
has a very powerful collection of data types including sets, bags, and tuples built into
the language which allow the theorem prover to ignore issues like the canonicalization
of arithmetic expressions. QA4 also provides contexts and backtracking facilities for
hypothetical reasoning. (Although Doyle's TMS makes these facilities unnecessary in
REASON, at the time of their incorporation into QA4, they represented a clear step
forward in theorem proving languages.) By using QA4, the SRI group was able to
build the theorem prover as a set of small QA4 procedures, grouped into clusters.
The SRI system is also reported on in [Waldinger & Levitt, 1974].

A third system was developed jointly between the Information Sciences Institute
at USC and the Automatic Theorem Prover project at the University of Texas [Good,
London & Bledsoe, 1975]. This system uses a modified version of the Stanford
verification condition generator, a powerful logical and algebraic manipulation package
called REDUCE [Hearn, 1971] and the theorem prover of [Bledsoe & Bruell, 1973].
The theorem prover is a natural deduction system with special heuristics for case
splitting, interval arithmetic (based on the technique of [Bundy, 1973]), and range
splitting in quantified statements [Bledsoe, 1971]. The verifier itself was proven
correct (by hand) in [Ragland, 1973].

Two other verification systems of note have been developed; neither of these uses
the Floyd-Hoare framework. The Boyer-Moore theorem prover for recursive function
theory [Boyer & Moore, 1975, 77] uses structural induction rather than inductive
assertions; it states both the program and its specifications in Pure LISP, using
symbolic execution to reduce the expression. Their system contains powerful methods
for choosing the basis for an induction and for generalizing sub-goals into lemmas.
The system has proved impressive theorems in recursive function theory; it has also
verified a fast string searching algorithm and an arithmetic simplifier.

The other verification system, [Milner, 1972a,b], [Milner & Weyrauch, 1972], is a
proof checker for Scott's Logic of Computable Functions (LCF) [Scott, 1972]. The
strength of the LCF system is that it operates within a powerful formal logic within
which it is possible to reason about complex procedures which manipulate procedures
as objects. Programs which we would find difficult to handle within our system
are handled directly within LCF. The system was implemented as a proof checker to
assist the human proof constructor. VonHenke has done some work on integrating
abstract recursive structures into the system [vonHenke, 1975]. In later work, [Gordon,
Milner, et al., 1978], the system was extended to facilitate the semi-automatic
generation of proofs and the integration of new strategies and types.

Of the systems I have mentioned so far, the one most similar to REASON is
Peter Deutsch's PIVOT. Both systems use symbolic interpretation in a forward
direction and interleave simplification with evaluation. The other Floyd-Hoare systems
use verification condition generators which reduce the entire computation history to a
single first order implication. A symbolic evaluation system quite similar to REASON
and Deutsch's PIVOT is described in [Hantler & King, 1976]. Other symbolic
execution systems have been used in a program testing (as opposed to verification)
environment to form symbolic expressions for the values of program variables. Typical
of these are [King, 1976], [Clarke, 1976], and [Howden, 1977,78]. [Balzer, 1978] uses a
weak form of symbolic evaluation to fill in the omitted details of an imprecise
program specification. [Yonezawa, 1977] describes a system quite similar to mine
which is based on that in [Hewitt & Smith, 1975]. However, his system was never
implemented.
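
The contrast drawn above, between forward symbolic evaluation with interleaved
simplification and the reduction of a whole computation to one implication, can be
made concrete with a very small sketch. It is not taken from REASON, PIVOT, or any
of the systems cited; the expression encoding and the names simplify, substitute,
and sym_exec are assumptions made only for illustration.

```python
# Expressions are ints, symbol names (str), or tuples ("+", a, b) / ("*", a, b).

def simplify(e):
    """Fold constants and apply a few identities (x+0, x*1, x*0)."""
    if not isinstance(e, tuple):
        return e
    op, a, b = e[0], simplify(e[1]), simplify(e[2])
    if isinstance(a, int) and isinstance(b, int):
        return a + b if op == "+" else a * b
    if op == "+" and a == 0:
        return b
    if op == "+" and b == 0:
        return a
    if op == "*" and (a == 0 or b == 0):
        return 0
    if op == "*" and a == 1:
        return b
    if op == "*" and b == 1:
        return a
    return (op, a, b)

def substitute(e, env):
    """Replace program variables in e by their current symbolic values."""
    if isinstance(e, str):
        return env.get(e, e)
    if isinstance(e, tuple):
        return (e[0], substitute(e[1], env), substitute(e[2], env))
    return e

def sym_exec(assignments, env):
    """Walk forward through straight-line code, simplifying after each step."""
    env = dict(env)
    for var, expr in assignments:
        env[var] = simplify(substitute(expr, env))
    return env

program = [
    ("t", ("*", "x", 1)),            # t := x * 1
    ("y", ("+", "t", 0)),            # y := t + 0
    ("z", ("+", "y", ("*", 2, 3))),  # z := y + 2 * 3
]
final = sym_exec(program, {"x": "x0"})   # x starts as the symbolic value x0
```

Because simplification runs after every assignment, the symbolic values stay small
as evaluation proceeds instead of being simplified only once at the end.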

REASON views programs more dynamically than do many of these other
systems. During its normal reasoning it moves assertions backward and forward
through the situations, reflecting the dynamic propagation of facts through the
situations of the program. In addition, when used within the recognition system,
REASON takes an even more dynamic view, expanding the program's temporal
behavior and resegmenting this into logical units. This process oriented view more
closely resembles the recent theoretical work of Pratt on process logic [Pratt, 1978]
and the work of Pnueli [Pnueli, 1977] on temporal logics suitable for describing
non-terminating programs like operating systems.




Section 15.1: Newer Areas of Verification Research
Synthesizing  Loop  Invariants 

There are several areas of current research on verification systems which I would
like to mention before going on to program understanding research more similar to my
own. The first of these is the problem of synthesizing loop invariants. The
Boyer-Moore system does not need to form loop invariants, but rather uses structural
induction on the data structures to achieve the same effect. It has a number of
heuristics for choosing which data structure to "induct" on, and works completely
automatically. The Floyd-Hoare systems, however, require some other process, human
or machine, to generate the inductive assertions. Although [Dijkstra, 1976] argues
that human programmers ought to generate loop assertions as the first step of
designing their programs, many researchers have found this cumbersome and would
prefer to have an automated loop invariant synthesizer.

Early research in this area includes [Cooper, 1971], [Elspas, 1974], and
[Elspas, et. al., 1972], in which recurrence relations are generated through use of a
ghost loop counter. [Wegbreit, 1973,74] introduced a number of heuristic techniques
which involve strengthening and weakening of assertions. Typically the original
assertion is one known to hold immediately after the loop's exit (this is easily
calculated by the normal VCG procedures) or immediately before entrance to the loop.
Strengthening heuristics include dropping a clause from a disjunction or adding one to
a conjunction. Wegbreit's techniques were implemented in [German, 1974] and
reported on in [German & Wegbreit, 1975]. [Katz & Manna, 1973,76] have developed
similar techniques including some for handling arrays and for strengthening a partially
correct loop invariant. [Greif & Waldinger, 1974] have also studied some techniques
in this area.
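
The ghost loop counter idea can be illustrated on a toy loop. The sketch below is
my own example, not one drawn from the papers cited: the loop, the variable names,
and the function run_with_ghost_counter are assumptions. A counter k that plays no
part in the computation is added to the loop, each variable is expressed as a
solution of a recurrence in k, and k is then eliminated to leave a candidate
invariant over the program variables alone.

```python
def run_with_ghost_counter(x0, i0, n):
    """Loop 'while i < n: x += 2; i += 1', instrumented with a ghost counter k."""
    x, i, k = x0, i0, 0
    while i < n:
        x, i = x + 2, i + 1
        k += 1  # ghost variable: counts iterations, never read by the program
        # Solutions of the recurrences x_k = x_(k-1) + 2 and i_k = i_(k-1) + 1:
        assert x == x0 + 2 * k and i == i0 + k
        # Eliminating k (k = i - i0) gives a candidate invariant without the ghost:
        assert x == x0 + 2 * (i - i0)
    return x, i, k
```

Running the function merely checks the candidate on particular inputs; a verifier
would still have to prove that the assertion is preserved by the loop body.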

More recent work in automatic synthesis of invariants has included
[Caplain, 1975] and implementation efforts such as those of [Dershowitz & Manna, 1977]
and [Moriconi, 1974]. [Cousot & Halbwachs, 1978] presents a method for
automatically deriving linear inequalities among the variables of a loop; these
inequalities may be used as invariants. [Basu & Misra, 1976] have studied some classes
of programs such as accumulation loops in which the loop invariant is especially easy
to develop automatically. Their work resembles to a small degree the work of
[Waters, 1978] on loop analysis, in that both look for standard patterns within a loop
and build a description of the loop based upon known properties of these standard
patterns.
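
To suggest why an accumulation loop makes the invariant easy to produce, the
following sketch treats the invariant as a template that is filled in once the
pattern has been recognized. It is a hedged illustration only: the template, and
the names accumulation_invariant and run_accumulation, are assumptions of mine and
do not reproduce the formulation of Basu and Misra or of Waters.

```python
from functools import reduce
import operator

def accumulation_invariant(op, init):
    """Invariant template for 'acc = init; for each item: acc = op(acc, item)'."""
    def holds(items, i, acc):
        # After i items, acc equals the fold of op over the first i items.
        return acc == reduce(op, items[:i], init)
    return holds

def run_accumulation(op, init, items):
    """Run the loop, checking the instantiated invariant at each iteration."""
    holds = accumulation_invariant(op, init)
    acc = init
    for i, item in enumerate(items):
        assert holds(items, i, acc)       # before processing items[i]
        acc = op(acc, item)
    assert holds(items, len(items), acc)  # at exit acc covers every item
    return acc
```

The same template serves for a summation (operator.add, 0) and for a product
(operator.mul, 1); only the operator and the initial value change.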


REASON currently relies exclusively on Waters' techniques plus more advanced
recognition techniques being worked on in [Rich, 1978]. I do not yet know whether
this will be completely sufficient, or whether I will have to include some of the
heuristic techniques mentioned above. Probably, REASON will only use such
techniques as a last resort strategy.

Abstraction  Techniques 

Another major area of ongoing research is the development of abstraction
techniques which allow the program and its proof to be structured into layers of
smaller procedures. Various approaches have been taken. [Hoare & Wirth, 1973] in
their axiomatization of PASCAL include a procedure call rule which is the basis for
any form of procedural abstraction. Various technical difficulties with the rule have
been discussed in [Cartwright & Oppen, 1978] and [Guttag, et al., 1977]. However,
the ability to refer to a procedure by its specification is only the first step in most
abstraction techniques.

Frequently, abstraction techniques have been concerned with specifying,
implementing, and proving the correctness of abstract data structures. [Hoare, 1972]
introduces a method for proving the correctness of a data structure implementation,
using the notion of an abstraction function to map between the variables of the
concrete space and the variables of the abstract space. [Parnas, 1972] develops a
method for hierarchically specifying a system in which each level of procedure is built
from modules at a lower level. Modules consist of two types of procedures: O
procedures which are allowed to have side effects and V procedures which cannot.
Thus, the values of the various V functions characterize the module's state, and the O
procedures can be specified in terms of their effect on the values of the V functions.
This is quite similar to our method of specifying side effects and has been used in
[Robinson & Levitt, 1977].
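
A small sketch may help fix the distinction between O and V procedures. The module
below is a toy of my own devising, not an example from Parnas; the class Counter and
the function check_increment_spec are assumed names. The point is only that the
specification of the O procedure mentions nothing but V function values before and
after the call.

```python
class Counter:
    """A toy module whose state is characterized entirely by its V functions."""

    def __init__(self, limit):
        self._n, self._limit = 0, limit

    # V functions: report on the state and have no side effects.
    def value(self):
        return self._n

    def is_full(self):
        return self._n >= self._limit

    # O function: changes the state; specified by its effect on the V functions.
    def increment(self):
        if not self.is_full():
            self._n += 1

def check_increment_spec(c):
    """Spec of increment, stated only through V function values before and after."""
    before, was_full = c.value(), c.is_full()
    c.increment()
    return c.value() == (before if was_full else before + 1)
```

The hidden field _n never appears in check_increment_spec, so the representation
could be changed without touching the specification.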

Some newer languages such as Alphard [Wulf, et al., 1976] and CLU
[Liskov, et. al., 1977] have extensive facilities for grouping procedures together into a
"data abstraction" with each procedure representing some of the behavioral capabilities
of the abstract datum. The procedures of the data abstraction share access to the
concrete representation, while procedures outside are, in general, denied such access.
Motivated by these languages, a specification technique, called data algebras, has been
developed in [Guttag, 1975] and [Zilles, 1975]. In this technique the data abstraction
is specified by axioms containing equations interrelating the behavior of the functions
contained in the data abstraction. Verification consists of showing that each module
in the "cluster" preserves each of the axioms. An inductive argument then shows that
the axioms are invariants of the cluster since only the modules of the cluster may act
on objects of the abstract type. This approach is quite different from those we use;
[Yonezawa, 1977] presents an argument for techniques more similar to ours.
[Liskov & Berzins, 1977] have written a survey of data specification techniques.
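
The flavor of an equational specification can be suggested with the familiar stack
example. The axioms and the tuple representation below are assumptions chosen for
brevity rather than a transcription of Guttag's or Zilles' notation, and the check
is a test over a few sample values, not a verification.

```python
EMPTY = ()  # concrete representation: an immutable tuple, newest element last

def push(s, x):
    return s + (x,)

def pop(s):
    return s[:-1]

def top(s):
    return s[-1]

def is_empty(s):
    return s == EMPTY

def axioms_hold(s, x):
    """Equations interrelating the operations of the abstraction."""
    return (pop(push(s, x)) == s          # pop undoes push
            and top(push(s, x)) == x      # top returns the last pushed value
            and is_empty(EMPTY)           # the empty stack is empty
            and not is_empty(push(s, x))) # a pushed stack is never empty

samples_ok = all(axioms_hold(s, x) for s in [EMPTY, (1,), (1, 2)] for x in [0, 7])
```

A real verification would show that the equations hold for every value the cluster
can construct, using the inductive argument described above.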

Side Effects on Complex Data Structures

A closely related area of research has dealt with the problems of side effects on
complex and shared data structures. [Oppen, 1975] presents an axiom system for
reasoning about Directed Graphs and derives a computability result for this system.
[Yellowitz & Duncan, 1978] also work with the DiGraph model, but develop a much
more succinct formalism.

[Suzuki, 1975] develops axioms for the complex data structures allowed in
PASCAL. In this system each data type is made to appear to be an array, indexed
by pointers of the appropriate data type. This is possible since in PASCAL a pointer
variable may only reference objects of a single type. Suzuki requires a special
notation for predicates which refer to a recursive data type; they must include
symbols to refer to all the data types involved in the definition. Suppose one wished
to have a predicate stating a well formedness criterion for lists of pairs. This would
have to be stated as


(Wf  II  lUt  0*  P$P%  If) 

where the last two symbols represent the pseudo-arrays of lists and pairs. If a side
effect is performed on any list or pair, this predicate will be updated by changing one
of the last two symbols in a manner analogous to the standard array rule. Although
logically sound, the system produces large expressions whose intuitive meaning is, at
best, unclear. In large interrelated systems, the expressions might well become
intractable. Suzuki's technique can be viewed as a special case of my
potential-dependency network, in which a very coarse filter is used to decide if an
assertion is threatened by a side effect. Whereas REASON filters for specific types of
side effects to a particular data type, Suzuki's system filters for any side effect; thus,
his system will produce unduly complicated expressions.
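
The pseudo-array view can be pictured roughly as follows. This is a loose sketch
under my own assumptions, not Suzuki's axiom system: each record type is modelled as
a dictionary indexed by pointer names, update plays the part of the standard array
rule, and the predicate well_formed takes both pseudo-arrays as explicit arguments.

```python
def update(arr, ptr, val):
    """Array rule: a new pseudo-array equal to arr except at index ptr."""
    new = dict(arr)
    new[ptr] = val
    return new

def well_formed(lst, LIST, PAIR):
    """lst is an acyclic, nil-terminated list whose items all point into PAIR."""
    seen = set()
    while lst is not None:
        if lst in seen or lst not in LIST:   # cycle or dangling pointer
            return False
        seen.add(lst)
        item, lst = LIST[lst]                # each cell is (item pointer, next pointer)
        if item not in PAIR:
            return False
    return True

LIST = {"l1": ("p1", "l2"), "l2": ("p2", None)}
PAIR = {"p1": (1, 2), "p2": (3, 4), "p3": (5, 6)}

# A side effect on p3, which the list never mentions, still yields a new
# pseudo-array, so the predicate has to be restated over PAIR2.
PAIR2 = update(PAIR, "p3", (0, 0))
```

The predicate is true over both PAIR and PAIR2, but nothing in its form records that
the change to p3 was irrelevant; that is the coarseness of the filter.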




[Yonezawa, 1977] uses a formalism which, except for the use of the TMS, is
quite a bit like mine; this is hardly accidental since we shared an office for 2 years
and both worked with Carl Hewitt. His thesis describes a system which uses the
situational calculus to reason about the situational transformation brought about by a
side effect. However, Yonezawa's later interests moved more towards a formalism for
reasoning about parallel procedures and synchronization; his system was never
implemented.




Section 15.2: Apprentice-Like Systems

There has been some other research which includes one or another of the
characteristics of the programmer's apprentice project. I will briefly review two
categories of these: the first involves some attempt to support the programmer in an
interactive design and evolution environment; the second involves some attempt to
catalogue standard programming knowledge.

Systems for Evolutionary Design

Most developed among the systems which support interactive design and
modification is the system of [Moriconi, 1977]. This system, called SID, is integrated
into the University of Texas/ISI verification system of
[Good, London & Bledsoe, 1975]. The new feature of this system is an interface
module which analyses proofs constructed by the theorem prover to determine the
dependencies between the verification conditions. These are then represented in a
simple network. The verification conditions are also analyzed so that their dependence
on particular lines of code is recorded in the data base. When program modifications
are proposed, the network is examined to see whether any logical link is affected.
This is quite similar to the apprentice's purpose links. However, if some dependency is
affected, both the verification condition generator and the theorem prover must be
called to completely recreate the proof for the modified section of code. Thus, the
system is less incremental in its analysis than is REASON which uses the Truth
Maintenance System to reuse as much of the original reasoning as possible.
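
The kind of bookkeeping described above can be sketched in a few lines. The tables
and the function affected_vcs below are hypothetical and are not taken from SID; they
only show how recorded dependencies let a system decide which verification conditions
(VCs) a proposed edit might invalidate.

```python
# Hypothetical records: the source lines each VC depends on, and the other
# VCs that the proof of each VC made use of.
vc_lines = {"vc1": {10, 11}, "vc2": {11, 14}, "vc3": {20}}
vc_uses = {"vc1": set(), "vc2": {"vc1"}, "vc3": {"vc2"}, "vc4": set()}

def affected_vcs(changed_lines):
    """Return every VC whose proof may be invalidated by editing changed_lines."""
    # VCs that mention a changed line directly.
    affected = {vc for vc, lines in vc_lines.items() if lines & changed_lines}
    # Propagate along proof dependencies until nothing new is reached.
    frontier = list(affected)
    while frontier:
        vc = frontier.pop()
        for other, used in vc_uses.items():
            if vc in used and other not in affected:
                affected.add(other)
                frontier.append(other)
    return affected
```

In this sketch everything reached is simply marked for re-proof from scratch, which
corresponds to the less incremental behavior noted above; reusing the unaffected
parts of each proof would require dependency records at a finer grain.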

Another project of a similar nature is that of [Dershowitz & Manna, 1977] and
[Katz & Manna, 1976]. In the first of these, analogies between the new and old
versions of the program specs are used to generate loop invariants for the new code.
In the second, a table of dependencies between loop invariants is maintained to aid in
the analysis of modifications. Both of these systems seem less developed than
Moriconi's. The Dershowitz and Manna paper has a brief discussion of the use of
schemata to capture some programming generalities, but none of these systems have
our emphasis on cataloging programming knowledge.


Knowledge  Based  Systems 

Several systems have, however, attempted to take a knowledge based approach.
[Schwartz, 1977] proposes to build a set of root programs representing basic
programming techniques and a set of correctness preserving combination rules.
However, this process is to take place in a strictly hierarchical manner; the work is
also intimately tied to the language SETL, in contrast to our attempt to remain
language independent.

[Gerhart, 1975a] proposes to catalogue programming knowledge in the form of
schemata which serve as syntactic templates. Programming knowledge and proof rules
are attached to each template, forming a catalogue of programming knowledge. I feel
that this representation is too low level and language dependent, even though some
generality is regained through use of correctness preserving transformations
[Gerhart, 1975b]. [Darlington & Burstall, 1973] have also studied the use of
transformations.

An early attempt to catalogue programming knowledge is found in
[Ruth, 1973,76a]. Ruth represented algorithms as grammars for a parser using an
ATN-like formalism. Each grammar can parse several programs representing different
implementations (including some with standard bugs) of the same algorithm. The
system was developed to parse the programs of beginning students and lacks several
features which seem important in the more complex domains which concern me. The
ATN-like formalism seems overly syntactic and cumbersome for the representation of
the wide range of programming knowledge which I desire to capture. Also, since the
formalism has no means of stating the intrinsic behavior of sub-segments, it has no
ability to represent the purposeful nature of the interconnections between modules.
Later work [Ruth, 1976b,c] develops an expert system for inventory data bases with
extensive knowledge of file and record organization.

The knowledge based system most similar in approach to the apprentice system is
the PSI system of [Green, et. al., 1976,77] (in particular the PECOS sub-system
developed in [Barstow, 77]). PECOS is a set of refinement rules for program synthesis
representing a broad range of knowledge about sets, mappings, tuples, arrays, etc.,
arranged to capture as much general knowledge as possible. Although the system has
some implementation and representational problems (it has neither a deductive system,
nor a clear notion of data flow), it does seem to capture a reasonable segment of the
knowledge of the expert programmer in a natural manner. Efficiency knowledge is
represented in another sub-system called LIBRA [Kant, 1977]. The entire PSI system
has been capable of synthesizing the code for a simple version of a learning program.

A different type of synthesis system has been developed by
[Manna & Waldinger, 1977] in which far more reliance is placed on deductive
capabilities. Their system is capable of synthesizing a program to satisfy a given set
of input-output specifications. It has rules for loop, recursion, and test formation as
well as a method for handling destructive interference between simultaneous sub-goals.
Another recent knowledge based system is the SAFE system of [Balzer, et al., 1977]
which takes informal program specifications and attempts to translate these into a
precise formal description from which a program may be synthesized.

An excellent treatment of many aspects of program verification and synthesis can
be found in [Manna & Waldinger, 1978].




Section 15.3: Dependency Based Reasoning

This thesis has been interlaced with many references to the works of my close
colleagues at the MIT AI Lab. My approach to developing the deductive machinery of
REASON has been influenced heavily by the AI Lab's intellectual atmosphere.
Dependency based systems like [Stallman & Sussman, 1977] and [Doyle, 1978] have
strongly influenced the redesign of REASON. Another dependency based system
which has influenced my thought is [London, 1977]. [Doyle, 78] surveys much of the
current literature on dependency based systems.

The idea of explicit representation of control as a stepping stone to introspective
systems is also heavily influenced by [Doyle, 1978] as well as by
[DeKleer, et al., 1977]. [McDermott, 1976] introduced the task network formalism
which he developed considerably further in his NASL system than I have yet done in
REASON. The idea of this form of organization was called to my attention by Doyle
(private communication). [Davis, 1976] uses a weak form of self reflection in a
backward chaining system.

A strong influence on my approach to handling side effects has been the
considerable AI literature on the frame problem. [McCarthy & Hayes, 1969] and
[Hayes, 1971a,b] are good introductions to this issue. [Raphael, 1970] surveys the
known techniques for handling the problem.

REASON's entire design is shaped by the central importance given to plans
within the apprentice system. HACKER [Sussman, 1975] introduced many of the
ideas which helped us develop the plan formalism. Other ideas such as the
categorization of plans into particular types first appeared in [Goldstein, 1974]. The
plan formalism as we now use it first appeared in [Rich & Shrobe, 1976]. Its current
form was heavily influenced by Waters' work in [Waters, 1977]. [Sacerdoti, 1975,75a]
developed a similar formalism as part of a plan compilation system. Johan deKleer's
[deKleer, 1976,77] work on understanding electronic circuits and Allen Brown's
[Brown, 1977] work on isolating failures in a circuit have also influenced the plan
formalism.




[Miller & Goldstein, 1976a,b,77] have been developing a notion of plan for use in
tutoring sessions. They have catalogued very general problem solving strategies, whose
plans they represent in an ATN-like grammar. The grammar is used to parse the
protocol of a student's coding sessions, so that the computer tutor can provide advice.
A similar methodology is used in [Genesereth, 1978] to help experts using the
MACSYMA symbolic manipulation system. However, in this case the assumption is
that the expert has correctly formulated a plan, but has based his plan on faulty
knowledge of the MACSYMA facilities. Genesereth's system functions as a
MACSYMA consultant, not a tutor.

The earliest mention of a system like the programmer's apprentice is in
[Floyd, 1971], although verification and related techniques were not yet well enough
developed to do much with the proposal. The pressures of engineering large Artificial
Intelligence systems led to another exploration of the idea in [Winograd, 1973].
[Hewitt & Smith, 1975] developed the idea further within the framework of the
PLASMA programming language. Both Hewitt and Smith encouraged me and
provided some initial insights.

I began this thesis by placing it within the context of a developing set of
techniques which I hope will help develop truly self-conscious systems. This document
is not intended to answer many of the difficult problems which will lie on that course,
but only to develop some technical foundations in program understanding which will
help those more bold than I. Minsky's [Minsky, 1968] and McCarthy's
[McCarthy, 1968] ideas of a decade ago still lie ahead of us, waiting for solution.




Bibliography 

Balzer, R. 1973 "Automatic Programming", Institute Technical Memo, University of
Southern California / Information Sciences Institute, Los Angeles, Cal.

Balzer, R. et. al. 1974 Domain-Independent Automatic Programming, ISI/RR-73-14
University of Southern California (March 1974).

Balzer, R; Goldman, N; Wile, D. 1978, "Meta-Evaluation as a Tool for Program
Understanding", ISI/RR-78-69, January 1978.

Barstow, David 1977, Automatic Construction of Algorithms and Data Structures, PhD.
Thesis, Stanford University, September 1977.

Barstow, David and Kant, Elaine 1976, "Observations on The Interaction of Coding
and Efficiency Knowledge in the PSI Program Synthesis System", Proceedings of
The Second International Conference on Software Engineering, San Francisco,
California, October 1976, pp 19 - 31.

Basu, S. & Misra, J. 1975, "Proving Loop Programs", IEEE Trans. on Software
Engineering Vol. 1 Number 1, March 1975.

Basu, S. & Misra, J. 1976, "Some Classes of Naturally Provable Programs", Second
International Conference on Software Engineering, pp 400-406, Oct. 1976.

Bauer, M. 1975 "A Basis for the Acquisition of Procedures from Protocols",
Proceedings of The Fourth International Joint Conference on Artificial Intelligence,
Tbilisi, Georgia USSR, September 1975.

Birtwistle, G.M.; Dahl, Ole-Johan; Myhrhaug, B.; and Nygaard, K. 1973 SIMULA
BEGIN, Auerbach, 1973.

Bledsoe, W.W. 1971, "Splitting and Reduction Heuristics in Automatic Theorem
Proving", Artificial Intelligence, vol 2, pp 55-77, North Holland Publishing Co.,
Amsterdam, 1971.




Bledsoe, W.W. & Bruell, P. 1973 "A Man-Machine Theorem Proving System",
Advance Papers of The Third International Joint Conference on Artificial
Intelligence, Stanford University, August 1973 pp 55-65; also Artificial
Intelligence vol 5, no 1 pp 51-72, Spring 1974.

Bose, A. and Stevens, K. 1965, Introductory Network Theory, Harper and Row, New
York, 1965, page 1.

Boyer, R.S. & Moore, J.S. 1975 "Proving Theorems About LISP Functions", Journal
of the Association for Computing Machinery vol. 22 no 1, January 1975.

Boyer, R.S. & Moore, J.S. 1977, "A Lemma Driven Automatic Theorem Prover for
Recursive Function Theory", Proceedings of the Fifth International Joint
Conference on Artificial Intelligence, Cambridge Mass., pp 511-519, August 1977.

Brown, A.L. 1977 Qualitative Knowledge, Causal Reasoning, and the Localization of
Failures, M.I.T. Artificial Intelligence Laboratory Technical Report 362, March
1977.

Bundy, A. 1973, "Doing Arithmetic With Diagrams", Advance Papers of the Third
International Joint Conference on Artificial Intelligence, Stanford University, pp
130-138, August 1973.

Burstall, R.M. 1974, "Program Proving As Hand Simulation and A Little Induction",
Proceedings of IFIP Conference 1974.

Burstall, R.M. 1969 "Proving Properties of Programs by Structural Induction",
Computer Journal vol. 12, pp 4-8.

Burstall, R.M. 1972 "Some Techniques for Proving Properties of Programs Which
Alter Data Structures", Machine Intelligence 7, Edinburgh University Press.

Caplain, M. 1975, "Finding Invariant Assertions for Proving Programs", Proc.
International Conference on Reliable Software, Los Angeles Calif., April 1975.




Cartwright, R. and Oppen, D. 1978, "Unrestricted Procedure Calls in Hoare's Logic",
Conference Record of the Fifth Annual ACM Symposium on Principles of
Programming Languages, Tucson, Arizona, Jan. 1978, pp 131-140.

Clarke, Lori A. 1976, "A System to Generate Test Data and Symbolically Execute
Programs", IEEE Transactions on Software Engineering vol. SE-2 no. 3, September
1976, pp 215-222.

Cooper, D.C. 1971, "Programs for Mechanical Program Verification", Machine
Intelligence 6, American Elsevier, New York, 1971, pp 3-59.

Cousot, P. and Halbwachs, N. 1978, "Automatic Discovery of Linear Restraints Among
Variables of a Program", Conference Record of the 5th Annual ACM Symposium
on Principles of Programming Languages, pp 84-96, Jan. 1978.

Dahl, O.J., Dijkstra, E., and Hoare, C.A.R. 1972 Structured Programming,
Academic Press, 1972.

Darlington, J. and Burstall, R.M. 1973 "A System Which Automatically Improves
Programs", Proceedings of the Third International Joint Conference on Artificial
Intelligence, Stanford University, August 1973.

Darlington, Jared L. 1973a "Automatic Program Synthesis in Second-Order Logic",
Proceedings of the Third International Joint Conference on Artificial Intelligence,
Stanford University, August 1973.

Davis, R. 1976, "Application of Meta-Level Knowledge to The Construction,
Maintenance, and Use of a Large Knowledge Base", Stanford AIM-238, 1976.

DeMillo, R.A., Lipton, R. and Perlis, A. 1977, "Social Processes in Proving Theorems
and Programs", Proceedings of the Fourth ACM Symposium on Principles of
Programming Languages, Los Angeles, 1978.

Dershowitz, N. and Manna, Z. 1974, "The Evolution of Programs: Automatic Program
Modification", IEEE Transactions on Software Engineering, vol SE-3 no. 6, Nov.
1977, pp 377-385.







Dershowitz, N. and Manna, Z. 1974, "Inference Rules for Program Annotation",
Stanford AIM-303, October 1977.

Deutsch, L.P. 1973, An Interactive Program Verifier, PhD. Thesis, University of
California at Berkeley, June 1973.

Dijkstra, E.W. 1975, "Guarded commands, nondeterminacy and formal derivation of
programs", Communications of the ACM, vol. 18, pp 453-457, Aug. 1975.

Dijkstra, E.W. 1976, A Discipline of Programming, Prentice-Hall, Englewood Cliffs,
N.J. 1976.

DeKleer, Johan 1976, "Local Methods for Localization of Faults in Electronic
Circuits", M.I.T. Artificial Intelligence Laboratory Memo 394.

DeKleer, Johan 1977, "A Theory of Plans for Electronic Circuits", MIT Artificial
Intelligence Laboratory Working Paper 144, May 1977.

DeKleer, J., Doyle, J., Steele, G. & Sussman, G.J. "AMORD: Explicit Control of
Reasoning", Proceedings of the Symposium on Artificial Intelligence and
Programming Languages, University of Rochester, August 1977.

Dolotta, T.A. and Mashey, J.R. 1976, "An Introduction to the Programmer's
Workbench", Proceedings of the Second International Conference on Software
Engineering, San Francisco, Cal., October 1976, pp 164-168.

Donzeau-Gouge, V., Huet, G., Kahn, G., Lang, B., and Levy, J.J. 1975 "A
Structure-Oriented Program Editor - A First Step Towards Computer Assisted
Programming", Report 114, Institut de Recherche en Informatique et
Automatique, France.

Doyle, Jon 1977, "More Explicit Control of Reasoning", unpublished manuscript.

Doyle, Jon 1978 Truth Maintenance Systems for Problem Solving, M.I.T. Artificial
Intelligence Laboratory Technical Report 419, January 1978.




Doyle,  Jon  1978b,  "Reflexive  Interpreters",  Proposal  for  Research  Leading  to  the 
degree  of  Ph.D.  MLT.  Dept,  of  EE&CS,  June  1978. 

Elspas,  Bernard,  1974,  "The  Semi-Automatic  Generation  of  Inductive  Assertions  for 
Proving  Program  Correctness",  SRI  Project  2686,  July  1974. 

Elspas,  B  ;  Green,  M;  Levitt,  K.  and  Waldinger,  R.  1972,  Research  in  Interactive 
Program  Proving  Techniques,  SRI  Project  8398  Phase  II  Report,  May  1972. 

Elspas,  B,  Levitt,  K.N.,  A  Waldinger,  R.J.,  An  Interactive  system  for  The  Verification 
of  Computer  Programs,  Stanford  Research  Institute  Project  1891  Final  Report, 
September  1973. 

Fikes  Richard  and  Nilsson,  Nils  1971,  "STRIPS:  A  New  Approach  to  the  Application 
of  Theorem  Proving  to  Problem  Solving".  Proceedings  of  the  Second  International 
Joint  Conference  on  Artificial  Intelligence  page  608. 

Floyd,  R  W.  1967  "Assigning  Meaning  to  Programs",  in  Mathematical  Aspects  of 
Computer  Science,  J.T.  Schwartz  (ed.)  vol  19,  Am.  Math.  Soc.  pp  19-32. 
Providence  Rhode  Island. 

Floyd,  R.W.  1971  "Toward  Interactive  Design  of  Correct  Programs",  IFIP,  1971. 

Genesereth, Michael 1978, Automated Consultation for Complex Computer Systems,
Ph.D. thesis, The Division of Applied Sciences, Harvard University, May 1978.

Gerhart, S.L. 1975a, "Knowledge About Programs: A Model and Case Study",
SIGPLAN Notices, Vol 10, Num. 6, Proceedings of the International Conference
on Reliable Software.

Gerhart, S.L. 1975b, "Correctness Preserving Program Transformations", Proceedings of
the Second Symposium on Principles of Programming Languages, Palo Alto.

German, S. 1974, A Program Verifier That Generates Inductive Assertions, Center For
Research in Computing Technology, Harvard University, Cambridge Mass.
Tech. Report TR-19-74, Aug 1974.


German, S. & Wegbreit, B. 1975, "A Synthesizer of Inductive Assertions", IEEE
Transactions on Software Engineering, vol. 1 num. 1, March 1975.

Goldstein,  Ira  1974,  Understanding  Simple  Picture  Programs,  MIT  Artificial 
Intelligence  Laboratory  Technical  Report  294,  September  1974 

Goldstein, Ira 1976, "The Computer As Coach", MIT Artificial Intelligence Laboratory
Memo 389, December 1976.

Goldstein, I.P. and Miller, M.L. 1976, "Structured Planning and Debugging, A
Linguistic Theory of Design", MIT AI Lab Memo 387, December 1976.

Good, Donald; London, Ralph; & Bledsoe, W.W. "An Interactive Program Verification
System",  IEEE  Transactions  on  Software  Engineering,  vol  SE-1,  number  1, 
March  1975. 

Gordon, M.; Milner, R.; Morris, L.; Newey, M.; and Wadsworth, C. 1978, "A
Metalanguage for Interactive Proof in LCF", Conference Record of the Fifth
ACM Symposium on Principles of Programming Languages, Tucson Arizona, Jan.
1978.

Green, C.C. 1976, "The Design of The PSI Program Synthesis System", Proceedings of
The  Second  International  Conference  on  Software  Engineering,  San  Francisco, 
October  1976,  pp  4-18. 

Green, C.C. 1977, "A Summary of The PSI Program Synthesis System", Proceedings of
The Fifth International Joint Conference on Artificial Intelligence, Cambridge,
Massachusetts,  August  1977,  pp  180  -  181. 

Green, C.C. & Barstow, D.R. "Some Rules for the Automatic Synthesis of Programs",
Proceedings of the Fourth International Joint Conference on Artificial Intelligence,
Tbilisi,  Georgia,  USSR,  September  1975. 

Greif, I. and Waldinger, R. 1974, "A More Mechanical Heuristic Approach to Program
Verification", Proceedings of International Symposium on Programming, Paris,
April 1974, pp 83-90.


Hoare, C.A.R. and Wirth, N. 1973, "An Axiomatic Definition of the Programming
Language  PASCAL",  Acta  Informatica,  2,4,  pp  335-355. 

Howden,  William  E  1977,  "Symbolic  Testing  and  the  DISSECT  Symbolic  Evaluation 
System", IEEE Transactions on Software Engineering, vol. SE-3 no. 4, July 1977,
pp  266-278. 

Howden,  William  E  1978,  "DISSECT  -  A  Symbolic  Evaluation  and  Program  Testing 
System", IEEE Transactions on Software Engineering, vol. SE-4 no. 1, January
1978.

Igarashi, S., London, R., and Luckham, D. 1973, Automatic Program Verification I: A
Logical Basis and Its Implementation, Stanford AIM-200, May 1973.

Jones, N. and Muchnick, S. 1977, "Even Simple Programs Are Hard To Analyze",
Journal  of  the  ACM  vol  24,  no  2  April  1977  pp  338-350 

Kant, E. 1977, "The Selection of Efficient Implementations for A High Level
Language",  Proceedings  of  the  Symposium  on  Artificial  Intelligence  and 
Programming  Languages,  University  of  Rochester  August  1977. 

Katz, S.M., and Manna, Z. 1973, "A Heuristic Approach to Program Verification",
Third International Joint Conference on Artificial Intelligence, Stanford U.
September 1973.

Katz, S. & Manna, Z. 1976, "Logical Analysis of Programs", Communications of the
ACM, Vol. 19 Num. 4, pp 188-206, April 1976.

King,  J.  1969,  A  Program  Verifier,  Carnegie  Mellon  University,  1969. 

King,  J.C.  1971  "Proving  Programs  to  be  Correct",  IEEE  Transactions  on  Computers, 
C-20,  11,  Nov.  1971. 

King, J.C. 1976, "Symbolic Execution and Program Testing", Communications of the
ACM,  July,  Vol  19,  No  7,  p  385. 


Knuth, D.E. 1968, The Art of Computer Programming, Vol. 1, Addison-Wesley.

Lipton, R.J. 1977, "A Necessary and Sufficient Condition for the Existence of Hoare
Logics", XEROX PARC CSL-77-4, June 1977.

Liskov,  B.  1974,  "A  Note  on  CLU",  MIT/Computation  Structures  Group  Memo  112, 
MIT/LCS,  November  1974. 

Liskov, B.; Snyder, Alan; Atkinson, Russell; and Schaffert, Craig 1977, "Abstraction
Mechanisms in CLU", Communications of the ACM, August 1977, pp 564-576.

Liskov, B. and Berzins, V. "An Appraisal of Program Specifications", M.I.T.
Computation Structures Group Memo 141-1, April 1977.

Liskov, B. and Zilles, S. 1974, "Programming with Abstract Data Types", Proc. of
Conf. on Very High Level Languages, SIGPLAN Notices, Vol 9, No. 4.

Liskov, B. & Zilles, S.N. 1975, "Specification Techniques for Data Abstractions",
IEEE Transactions on Software Engineering, Vol SE-1 No. 1, March 1975.

Litvintchouk, S.D. and Pratt, V.R. 1977, A Proof Checker for Dynamic Logic, MIT
AI Memo 429, June 1977.

London, P. 1977, "A Dependency-Based Modelling Mechanism for Problem Solving",
Univ.  of  Maryland,  Computer  Science  Dept  TR-589,  Nov.  1977. 

London, R. 1975, "A View of Program Verification", ACM SIGPLAN Notices, Vol 10,
No. 6, Proc. of International Conf. on Reliable Software.

Long, W.J. 1977, "A Program Writer", MIT/LCS/TR-107, November 1977 (Ph.D.
Thesis).

Luckham, David C. & Suzuki, Norihisa, "Automatic Program Verification IV: Proof of
Termination Within a Weak Logic of Programs", Stanford AIM-269, October
1975.


Luckham, David C. & Suzuki, Norihisa 1976, "Automatic Program Verification V:
Verification-Oriented Proof Rules for Arrays, Records and Pointers", Stanford
AIM-278, March 1976.

Macsyma 1975, The MACSYMA Reference Manual, The Mathlab Group, MIT
Laboratory  for  Computer  Science,  November  1975. 

Manna, Z. 1969, "Properties of Programs and The First-order Predicate Calculus",
Journal of the ACM vol 16, no 2, pp 244-255, April 1969.

Manna, Z. 1969, "The Correctness of Programs", Journal of Computer and System
Sciences, vol. 3 no 2, pp 119-127, May 1969.

Manna, Z. and Pnueli, A. 1969, "Formalization of Properties of Recursively Defined
Functions",  Proceedings  of  the  ACM  Symposium  of  Theory  of  Computing  pp. 
201-210,  ACM  New  York  1969. 

Manna,  Z.  and  Pnueli,  A.  1970,  "Formalization  of  Properties  of  Functional  Programs", 
Journal of the ACM vol. 17 no 3, pp 555-569, July 1970.

Manna,  Z.  and  Pnueli,  A.  1974,  "Axiomatic  Approach  to  Total  Correctness  of 
Programs",  Acta  Informatica  3,  pp  253-263,  1974. 

Manna, Z. and Waldinger, R. 1975, "Knowledge and Reasoning in Program Synthesis",
Artificial Intelligence 6, pp 175-208.

Manna, Z. and Waldinger, R. 1976, "Is 'sometime' sometimes better than 'always'?
Intermittent  assertions  in  proving  program  correctness"  Proceedings  of  the  Second 
International  Conference  on  Software  Engineering,  October  1976 

Manna,  Z.  and  Waldinger,  R.  1977,  Synthesis:  Dreams  =>  Programs,  Stanford 
Research  Institute  Technical  Note  156,  November  1977. 

Manna,  Z.  and  Waldinger,  R.  1978,  "The  Logic  of  Computer  Programming",  IEEE 
Transactions  on  Software  Engineering,  vol  SE-4  no.  3,  May  1978,  pp  199-229. 


McCarthy, J. 1962a, "Computer Programs for Checking Mathematical Proofs", in
Recursive Function Theory, Proceedings of Symposia in Pure Mathematics, vol. 5,
Am. Math. Soc.

McCarthy,  J.  1962b,  Towards  a  Mathematical  Theory  of  Computation.  Proceedings  of 
1962  IFIP 

McCarthy, J. 1963, A Basis for a Mathematical Theory of Computation, in Computer
Programming and Formal Systems, P. Braffort & D. Hirschberg eds.,
North-Holland, Amsterdam 1963.

McCarthy, J. 1964, A Formal Description of a Subset of Algol, Proceedings of the
conference  on  Formal  Language  Description  Languages,  Vienna  1964. 

McCarthy  J.  and  Painter  J.A.  1966,  "Correctness  of  A  Compiler  for  Arithmetic 
Expressions",  Stanford  University  Technical  Report  CS38,  April  29,  1966,  also  in 
Proceeding  of  a  Symposium  in  Applied  Mathematics,  vol.  19  --  Mathematical 
Aspects of Computer Science, pp 33-41, J.T. Schwartz ed., Am. Math. Soc.,
Providence R.I.

McCarthy,  J.  1963,  "Towards  a  mathematical  science  of  computation"  Proceedings  of 
IFIP  Congress  62,  pp  21-28,  Amsterdam:  North  Holland 

McCarthy,  J.  and  Hayes,  P.  1969  "Some  Philosophical  Problems  from  the  Standpoint 
of  Artificial  Intelligence",  Machine  Intelligence  4,  American  Elsevier,  N.Y. 

McDermott,  Drew  Vincent  1976,  Flexibility  and  Efficiency  in  a  Computer  Program  for 
Designing  Circuits,  MIT  PhD.  Thesis,  September  1976 

Mikelsons, M. 1975, "Computer Assisted Application Definition", Proceedings of the
Second ACM Symposium on Principles of Programming Languages, Palo Alto.

Miller, M.L. and Goldstein, I.P. 1976a, "SPADE: A Grammar Based Editor For
Planning and Debugging Programs", MIT AI Lab Memo 386, December 1976.


Miller, M.L. and Goldstein, I.P. 1977a, "Structured Planning and Debugging",
Proceedings  of  the  Fifth  International  Joint  Conference  on  Artificial  Intelligence, 
MIT,  August  1977. 

Miller, M.L. and Goldstein, I.P. 1977b, "Problem Solving Grammars as Formal Tools
for Intelligent CAI", ACM 77, October 1977.

Milner,  R.  1972a,  "Logic  for  Computable  Functions:  Description  of  a  Machine 
Implementation", Stanford AIM-169.

Milner, R. 1972b, "Implementation and Applications of Scott's Logic for Computable
Functions", Proceedings of an ACM Conference on Proving Assertions about
Programs, SIGPLAN Notices vol 7, no. 1, pp 1-6.

Milner, R. and Weyhrauch, R. 1972, "Proving Compiler Correctness in a Mechanized
Logic", Machine Intelligence 7, Meltzer and Michie eds., pp 51-70, Edinburgh
University  Press,  1972. 

Minsky, Marvin 1961, "Steps Towards Artificial Intelligence", Proceedings IRE, Vol.
49,  No.  1,  1961. 

Minsky, Marvin 1961, "A LISP Garbage Collector Using Serial Secondary Storage",
MIT  AI  Memo  No.  58  (revised)  1963. 

Minsky,  Marvin  1968,  "Descriptive  Languages  and  Problem  Solving"  and  "Matter, 
Mind, and Models" in Semantic Information Processing, Marvin Minsky ed., The
MIT  Press,  Cambridge  Mass,  July  1968. 

Minsky,  M.,  "Form  and  Content  in  Computer  Science",  Journal  of  the  ACM  17,  No. 
2,  April  1970,  pp  197-215,  1970  ACM  Turing  Lecture. 

Moore, J S. 1974, "Introducing PROG into the PURE LISP Theorem Prover", Xerox
PARC Report CSL-74-3.

Moore,  Robert  Carter  1975,  Reasoning  From  Incomplete  Knowledge  In  A  Procedural 
Deduction  System,  MIT/AI-TR-147  December  1975. 


Moriconi, M. 1974, "Towards the Interactive Synthesis of Assertions", Res. Report,
University  of  Texas  at  Austin,  Oct.  1974. 

Moriconi, M. 1977, A System For Incrementally Designing and Verifying Programs,
ISI/RR  77-65,  November  1977;  also  PhD  Thesis  University  of  Texas,  1977. 

Morris, J.H. & Wegbreit, B. 1977, "Subgoal Induction", Communications of the ACM,
Vol. 20 Num. 4, pp 209-222, April 1977.

Naur, P. 1966, "Proofs of Algorithms by General Snapshots", BIT vol. 6, pp 310-316,
1966. 

Nelson, G. and Oppen, D. 1978, "A Simplifier Based on Efficient Decision Algorithms",
Conference Record of the Fifth ACM Symposium on Principles of Programming
Languages, Tucson, Arizona, January 1978, pp 141-150.

Newell, A., Shaw, J.C. and Simon, H. 1959, "Report on a General Problem Solving
Program",  Proceedings  of  the  International  Conference  on  Information  Processing, 
Paris  UNESCO  House,  1959. 

Oppen, Derek 1975, On Logic and Program Verification, University of Toronto
Technical  Report  82,  (also  PhD  dissertation)  April  1975. 

Oppen, Derek 1978, "Reasoning About Recursively Defined Data Structures",
Conference Record of the Fifth ACM Symposium on Principles of Programming
Languages, Tucson, Arizona, January 1978, pp 151-157.

Parnas, D.L. 1972, "A Technique for Software Module Specification with Examples",
Communications  of  the  ACM  Vol  15,  No  5. 

Paterson, M. and Hewitt, C. 1970, "Comparative Schematology", Conference Record
ACM Conference on Concurrent Systems and Parallel Computation (1970).

Pnueli, Amir 1977, "The Temporal Logic of Programs", 18th Annual Symposium on
Foundations of Computer Science, pp 46-57, Oct. 77.


Pratt,  V.  1976,  "Semantical  Considerations  on  Floyd-Hoare  Logic",  MIT/LCS/TR-168, 
September 1976; also in Proceedings of the 17th Annual IEEE Symposium on
Foundations  of  Computer  Science,  pp  109-121,  1976. 

Pratt,  V.  1978,  "Process  logic"  unpublished  manuscript,  May  1978. 

Ragland, L.C., A Verified Program Verifier, 1973, Ph.D. dissertation, University of
Texas, Austin, 1973.

Raphael, B. 1970, "The Frame Problem in Problem-Solving Systems", SRI Technical
Note  11,  June  1970. 

Rich, C. 1977, "Plan Recognition In A Programmer's Apprentice", MIT/AI Working
Paper  147,  May  1977. 

Rich,  C.  1979  forthcoming,  "A  Representation  for  Programming  Knowledge  and 
Applications  to  Recognition,  Generation,  and  Cataloguing  of  Programs",  PhD. 
thesis  forthcoming,  January  1979. 

Rich, C. and Shrobe, H. 1976, An Initial Report On A LISP Programmer's Apprentice,
MIT/AI/TR-354, December 1976.

Robinson, L. and Levitt, K. 1977, Communications of the ACM vol 20, no. 4, April
1977, pp 271-281.

Ruth, Gregory 1971, Analysis of Algorithm Implementations, M.I.T. Ph.D. Thesis,
Project  MAC  Technical  Report  110 

Ruth, Gregory 1976a, "Intelligent Program Analysis", Artificial Intelligence 7, Spring
1976, pp 65-85.


Ruth, Gregory 1976b, "Protosystem I: An Automatic Programming System Prototype",
TM 72, MIT Laboratory for Computer Science, 1976 (also in the Proceedings of
the 1978 NCC in abbreviated form).


Ruth, Gregory 1976c, "Automatic Design of Data Processing Systems", ACM
Symposium  on  Principles  of  Programming  Languages,  1976. 

Sacerdoti, Earl D. 1973, "Planning in a Hierarchy of Abstract Spaces", Proceedings of
the Third International Joint Conference on Artificial Intelligence, Stanford
University, September 1973, pp. 412-422.

Sacerdoti, Earl D. 1973, "The Non-Linear Nature of Plans", Stanford Research
Institute A.I. Group Technical Note 101.

Sacerdoti, Earl D. 1973a, "A Structure for Plans and Behavior", SRI Technical Note
109. 

Schwartz, J.T. 1973, On Programming, An Interim Report on the SETL Project;
Installment I: Generalities, Courant Institute of Mathematical Sciences, New
York University, February 1973.

Schwartz, J.T. 1977, "On Correct Program Technology", in Courant Computer Science
Report  #12,  September  1977. 

Scott, D. 1972, "Lattice Theory, Data Types and Semantics", in Formal Semantics of
Programming  Languages,  Rustin  ed,  Prentice-Hall,  p  65. 

Shaw, D., Swartout, W., and Green, C. 1975, "Inferring LISP Programs from
Examples", Proceedings of the Fourth International Joint Conference on Artificial
Intelligence, Tbilisi, Georgia, USSR, August 1975.

Shrobe, Howard E. 1978, "Plan Verification in A Programmer's Apprentice", M.I.T.
Artificial Intelligence Laboratory Working Paper 158, January 1978.

Siklossy, L. and Sykes, D. 1975, "Automatic Program Synthesis from Example
Problems",  Fourth  International  Joint  Conference  on  Artificial  Intelligence  Tbilisi 
Georgia,  USSR,  August  1975. 

Spitzen, J. & Wegbreit, B. 1975, "The Verification and Synthesis of Data Structures",
Acta  Informatica  4,  1975. 


Stallman, Richard and Sussman, G.J. 1977, "Forward Reasoning and
Dependency-Directed Backtracking In A System for Computer-Aided Circuit
Analysis", Artificial Intelligence Journal, October 1977.

Steele, Guy L. 1977, "Debunking the 'Expensive Procedure Call' Myth", Proceedings of
ACM  77,  October  1977. 

Strong, H.R. 1970, Translating Recursion Equations into Flow Charts, Reports RC
2834  (March  1970)  and  RC  2859  (April  1970),  IBM  Yorktown  Heights. 

Suppes,  Patrick  1957,  Introduction  to  Logic,  Van  Nostrand,  New  York,  1957. 

Sussman, G.J. 1973, A Computational Model of Skill Acquisition, M.I.T. Department
of Mathematics Ph.D. Thesis; M.I.T. Artificial Intelligence Laboratory Technical
Report 297, August 1973; A Computer Model of Skill Acquisition, New York,
American Elsevier 1975.

Sussman,  G.J.  1977a,  "Electrical  Design:  A  Problem  for  Artificial  Intelligence 
Research", Proceedings of The Fifth International Joint Conference on Artificial
Intelligence,  Cambridge,  Massachusetts,  August  1977. 

Sussman, G.J. 1977b, "SLICES: At The Boundary Between Analysis and Synthesis",
M.I.T. Artificial Intelligence Laboratory Memo 433, July 1977. (Also to appear
in The Proceedings of The IFIP Working Conference on Artificial Intelligence and
Pattern Recognition in Computer Aided Design in 1978.)

Sussman, G.J. and Stallman, Richard 1975, "Heuristic Techniques in Computer Aided
Circuit Analysis", IEEE Transactions on Circuits and Systems, Vol. CAS-22, No.
11, November 1975.

Suzuki, N. 1975, "Automatic Program Verification II: Verifying Programs by Algebraic
and  Logical  Reduction",  Stanford  AIM-255,  December  1974. 

Suzuki, N. 1976, "Automatic Verification of Programs with Complex Data Structures",
Stanford AIM-279, February 1976.


Teitelman, W. et al. 1975, Interlisp Reference Manual, Xerox Palo Alto Research
Center,  Dec  1975. 

Teitelman,  W.  1977  "A  Display  Oriented  Programmer's  Assistant"  Proceedings  of  the 
Fifth  International  Joint  Conference  on  Artificial  Intelligence,  MIT,  August  1977. 

von Henke, F. 1975, "On the Representation of Data Structures in LCF with
Applications  to  Program  Generation",  Stanford  AIM-267,  September  1975. 

Von Neumann, J., Goldstine, "Planning and Coding Problems for an Electronic
Computing Instrument, part 2 vol 1-3", John von Neumann Collected Works vol 5,
pp 80-235 (Pergamon Press, New York, 1963).

Waldinger, R. and Levitt, K.N. 1974, "Reasoning About Programs", Artificial
Intelligence 5, pp. 235-316, also SRI Technical Note 86, October 1973.

Waldinger,  Richard  1975,  "Achieving  Several  Goals  Simultaneously"  Stanford  Research 
Institute A.I. Group Technical Note 107.

Waters, R.C. 1976, "A System for Understanding Mathematical FORTRAN
Programs", M.I.T. Artificial Intelligence Laboratory Memo 368, August 1976.

Waters, R.C. 1977, "A Method, Based on Plans, for Understanding How a Loop
Implements a Computation", M.I.T. Artificial Intelligence Laboratory Working
Paper 150, July 1977.

Waters, R.C. 1978, A Method For Automatically Analyzing the Logical Structure of
Programs, Ph.D. Thesis, M.I.T. (forthcoming) EE&CS, Sept 1978.

Wegbreit, B. 1973, "Heuristic Methods for Mechanically Deriving Inductive
Assertions", Proceedings of the Third International Joint Conference on Artificial
Intelligence, Stanford University, September 1973.

Wegbreit, B. 1974, "The Synthesis of Loop Predicates", Communications of The ACM,
Vol 17, pp 102-112, Feb 1974.


Wegbreit, B. 1976, "Constructive Methods In Program Verification", Xerox Palo Alto
Research Center CSL-76-2, July 1976.

Wilczynski, D. 1975, "A Process Elaboration Formalism for Writing and Analyzing
Programs", U. of S. Cal. Information Sciences Inst., ISI/RR-75-35.

Winograd,  Terry  1973  "Breaking  the  Complexity  Barrier  Again"  Proceedings  of  the 
ACM SIGIR-SIGPLAN Interface Meeting, Nov. 1973.

Wulf, W.A. 1974, "ALPHARD: Towards a Language to Support Structured
Programming", Carnegie Mellon University Dept. of Comp. Sci., April 1974.

Wulf, W.; London, R.; and Shaw, M. 1976, "An Introduction to the Construction and
Verification of Alphard Programs", IEEE Transactions on Software Engineering,
vol. SE-2, no. 4, December 1976, pp 253-265.

Yellowitz,  L  and  Duncan,  A.  1978,  "Data  Structures  and  Program  Correctness: 
Bridging  The  Gap",  Computer  Languages  vol  3.  pp  135-142. 

Yonezawa, A. 1975, "Meta Evaluation of Actors With Side Effects", MIT/AI Working
Paper  101,  June  1975. 

Yonezawa, Aki 1976a, "Meta Evaluation for Verification and Analysis of Actor
Programs", Draft paper, M.I.T. A.I. Lab.

Yonezawa,  A.  1976a,  "Symbolic-Evaluation  As  An  Aid  To  Program  Synthesis", 
MIT/AI Working Paper 124, April 1976.

Yonezawa,  A.  1976b,  "Symbolic  Evaluation  Using  Conceptual  Representations  For 
Programs  With  Side  Effects",  MIT/AI  Memo  399,  December  1976. 

Yonezawa, A. 1977, Verification and Specification Techniques for Parallel Programs
Based on Message Passing Semantics, M.I.T. Ph.D., December 1977.

Zilles,  S.  1975,  "Abstract  Specification  for  Data  Types",  IBM  Research  Laboratory, 
San  Jose  California,  1975. 


